IMAGE PROCESSING DEVICE, IMAGE PROCESSING METHOD, AND RECORDED MEDIUM

FIELD OF THE INVENTION 

The present invention relates to an image processing device and
image processing method for reducing the memory size of image data
inputted from a scanner, digital camera, etc., or image data sent through
communication means such as the Internet, and for outputting those data
clearly.

BACKGROUND OF THE INVENTION 

With technical innovation in recent years, electronic photographing
apparatuses that store images on storage media such as magnetic media or
semiconductor memory instead of silver film have come into practical use.
In particular, the digital camera, which reads density and color using a
charge coupled device (CCD) element and stores them on a semiconductor
memory or the like, has found its way into homes because of its handiness
and low price. The resolution of the CCD element is being upgraded from
1,000,000 picture elements to 2,000,000 picture elements. Depending on the
number of effective picture elements, the CCD element has a resolution of
approximately 1,280 picture elements x 980 picture elements, so the image
data size is very large and the image data must be compressed. For still
image compression, the JPEG (Joint Photographic Experts Group) standard is
generally adopted. A prior art image processing device that compresses
data by this JPEG method is shown in a block diagram in FIG. 39.

An original image is read by image input means 10 formed of CCD elements
etc. Lossy compression (non-reversible compression) by JPEG is carried out
on the original image by compression means 3900.

That is, discrete cosine transform (DCT) means 3906 performs a discrete
cosine transform on the original image, whereby the original image is
transformed into signals in the frequency space, and the obtained
transform coefficients are quantized by quantization means 3907 using a
quantization table 3901. The results of this quantization are transformed
into a code string by entropy encoding means 3908 on the basis of an
entropy encoding table 3902, and this code string is stored on a storage
medium 15. This processing is continued until all the original images have
been compressed. In this connection, a lossless compression (reversible
compression) method in which an image can be restored without distortion
is also proposed for the JPEG standard. When lossless compression is used,
the compression ratio, which is defined as the ratio of the original image
size to the compressed image size, is very low. Therefore, lossy
compression is generally used. But when lossy compression is used, the
original image is not exactly reconstructed because of the quantization
error in the process of FIG. 39 and the rounding error of the DCT. Of
these two irreversible factors, the quantization error in particular
degrades the quality of the reconstructed image.
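
The effect of the quantization step on the reconstructed image can be
illustrated with a minimal sketch. It is not the JPEG codec itself: it
assumes a single uniform quantization step (q_step, a hypothetical
parameter) in place of quantization table 3901, omits entropy coding, and
uses an orthonormal DCT built directly with NumPy. The function names are
illustrative only.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix C, so that C @ C.T = I."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    x = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def lossy_roundtrip(block, q_step):
    """DCT -> uniform quantization -> inverse quantization -> IDCT (square block)."""
    c = dct_matrix(block.shape[0])
    coeff = c @ block @ c.T               # forward 2-D DCT
    quantized = np.round(coeff / q_step)  # the lossy step (assumed uniform)
    restored = quantized * q_step         # inverse quantization
    return c.T @ restored @ c             # inverse 2-D DCT

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float)
for q_step in (1.0, 16.0, 64.0):
    err = np.abs(lossy_roundtrip(block, q_step) - block).max()
    print(f"q_step={q_step:5.1f}  max abs error={err:.2f}")
```

Because the transform is orthonormal, each coefficient is off by at most
half a step, so the pixel error of an 8 x 8 block cannot exceed 4 x q_step
in this sketch; coarser steps give smaller data but a larger error.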

Meanwhile, image processing devices such as the digital camera are
generally provided with a display unit such as a liquid crystal display
for reviewing a photographed image on the spot, retrieving an image for
data editing, etc. The resolution of the CCD is high compared with the
resolution of the display unit for review, so when compressed image data
stored on the storage medium 15 is displayed on such a display unit, the
compressed image data is expanded and picture elements are thinned out to
transform the resolution.

First, expanding means 3903 in FIG. 39 expands compressed image data
stored on the storage medium 15. In this process, the entropy-encoded
image signal is brought back to quantized transform coefficients by
entropy decoding means 3911 and then to DCT coefficients by inverse
quantization means 3910. Then inverse discrete cosine transform (IDCT) is
performed on these DCT coefficients by IDCT means 3909 to expand the data
to values in the original image data space. This expanded image data is
outputted to a printer, CRT, etc. by Original Image (OI) output means 3912
in accordance with an instruction signal inputted by the user through a
mouse or the like (not shown).

On the other hand, means 3904 for thinning out images thins out picture
elements from all the image data expanded by expanding means 3903
according to the resolution of means 3905 for displaying reduced image for
review display, and this thinned-out image data is displayed for review by
means 3905 for displaying reduced image.

Now, consider reading an original of JIS paper size A4, for example,
using a digital camera having a CCD with 2,000,000 picture elements. If
the resolution is estimated from the number of picture elements of the
CCD element, the digital camera has a resolution of only about 100
dots/inch, though that estimate does not always apply because of the
focal distance and the like. This resolution is far coarser than that of
an image obtained on silver film, and the rendering of edge information
and fine detail in a natural image is still insufficient.
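
The estimate of roughly 100 dots/inch can be checked with a short
calculation. The sketch below assumes that the approximately 1,280 x 980
picture elements quoted earlier cover an A4 sheet (210 mm x 297 mm)
exactly; the function name estimate_dpi is illustrative.

```python
MM_PER_INCH = 25.4

def estimate_dpi(pixels, length_mm):
    """Dots per inch when `pixels` samples span `length_mm` millimetres."""
    return pixels / (length_mm / MM_PER_INCH)

# A4 is 210 mm x 297 mm; 980 x 1,280 is the approximate CCD resolution
# quoted in the text, assumed here to cover the page exactly.
short_side = estimate_dpi(980, 210.0)
long_side = estimate_dpi(1280, 297.0)
print(f"{short_side:.1f} dpi x {long_side:.1f} dpi")
```

Under these assumptions both sides come out between about 110 and 120
dots/inch, which agrees with the order of magnitude stated above.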

For this reason, when an image inputted by the digital camera is
outputted on a high-resolution printer such as a laser printer or ink jet
printer, or is displayed on a high-resolution display connected to a
computer for data editing, the low-resolution image has to be transformed
into high-resolution data. In this connection, the sending of image data
by communication means such as the Internet has become popular in recent
years, but since the communication speed and the data size that can be
sent at a time are limited, image data are in many cases lowered in
resolution before being sent. Because of that, the image data sent tends
to become an image with blurred details when displayed on the receiver's
side, and there is a mounting call among users for low-resolution images
sent over the Internet etc. to be upgraded in resolution. Note that
high-resolution transform and enlargement of an image are synonymous; the
conventional methods of enlarging an image with precision are explained
below.

The conventional methods of enlarging an image are roughly classified
into the following two techniques:

(1) Merely interpolating between the picture elements.

(2) Enlarging an image after transforming the image data in the real space
into image data in the frequency space using orthogonal transform
techniques such as FFT (fast Fourier transform) or DCT, as described in
unexamined Japanese patent applications 2-76472 and 5-167920.

First, a conventional system example that realizes the technique of
interpolating between the picture elements in (1) is shown in a block
diagram in FIG. 42. On an original image obtained by image input means
10, picture elements are linearly interpolated between the existing
picture elements in accordance with the following Formula 1. In Formula 1,
Da is picture element data at point A, Db is picture element data at
point B, Dc is picture element data at point C, and Dd is picture element
data at point D. De is the picture element data at point E to be worked
out.

Formula 1

De = (1 - μ) x (1 - ν) x Da + μ x (1 - ν) x Db
   + (1 - μ) x ν x Dc + μ x ν x Dd
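
Formula 1 can be sketched in code as follows. The sketch assumes that A, B,
C and D are the upper-left, upper-right, lower-left and lower-right
neighbours of E, and that the two weights of Formula 1 (named mu and nu
here) are the horizontal and vertical fractional offsets of E from A, each
between 0 and 1; the text does not state this layout, so it is an
assumption, as are the function names.

```python
import numpy as np

def interpolate(da, db, dc, dd, mu, nu):
    """Formula 1: weighted mean of the four neighbours of point E."""
    return ((1 - mu) * (1 - nu) * da + mu * (1 - nu) * db
            + (1 - mu) * nu * dc + mu * nu * dd)

def enlarge_bilinear(img, s):
    """Enlarge a 2-D array (at least 2 x 2) by factor s using Formula 1."""
    h, w = img.shape
    out_h, out_w = int(round(h * s)), int(round(w * s))
    ys = np.linspace(0, h - 1, out_h)          # sample rows in source units
    xs = np.linspace(0, w - 1, out_w)          # sample columns in source units
    y0 = np.minimum(np.floor(ys).astype(int), h - 2)
    x0 = np.minimum(np.floor(xs).astype(int), w - 2)
    nu = (ys - y0)[:, None]                    # vertical offset from A
    mu = (xs - x0)[None, :]                    # horizontal offset from A
    da = img[y0][:, x0]                        # upper-left neighbours
    db = img[y0][:, x0 + 1]                    # upper-right neighbours
    dc = img[y0 + 1][:, x0]                    # lower-left neighbours
    dd = img[y0 + 1][:, x0 + 1]                # lower-right neighbours
    return interpolate(da, db, dc, dd, mu, nu)

small = np.array([[0.0, 1.0], [2.0, 3.0]])
print(enlarge_bilinear(small, 2))
```

The four weights always sum to 1, so a flat region stays flat and the
corner picture elements of the original are reproduced unchanged.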

Since interpolated image data tends to be averaged and to blur at edges
etc., the interpolated picture element positions and the interpolated
data are often adjusted using edge information extracted from the
original image. In FIG. 42, edge information is extracted by edge
extracting means 4200 and is enlarged to a desired enlargement ratio by
edge interpolation means 4201 to obtain an enlarged edge. The
interpolated image of the original image produced by picture element
interpolating means 1100 is then convolved with this enlarged edge
information, so that a desired enlarged image is generated and outputted
by Enlarged Images (EI) output means 501. There are also other
interpolation methods, such as the nearest neighbor method, in which the
value of the nearest sample is taken as the interpolated value.

But the interpolation methods mentioned above have some problems. In
the linear interpolation method, the frequency characteristics in the
pass band are suppressed and the image data is smoothed as if it had been
passed through a low pass filter (LPF), so that the image tends to blur,
with insufficient sharpness and rendering of details. The problem with
the nearest neighbor method is that a large loss of high frequency
components occurs, which tends to cause distortion of the enlarged image,
for example jaggies in edge areas and blurred mosaic distortion.

The technique in (2) using the spatial frequency is proposed as a
solution to that problem. The aim of that technique is to restore the
high frequency components lost in data sampling and to restore
information on details and edges of the image accurately, thereby
improving the visual quality of the enlarged image.

FIG. 40 is a block diagram of a conventional system example to 
realize the technique in (2) that uses spatial frequency, while FIG. 41 
schematically shows the processing procedure. First, an original image (n 
x n picture elements) in the real space as shown in FIG. 41 (a) is 
orthogonally transformed into an image (n x n picture elements) in the 
frequency space as shown in FIG. 41 (b). The image data in this 
frequency space is expressed in an n x n matrix, and the matrix obtained 
by this frequency transform shows lower frequency component as the 
position moves nearer the upper left of the figure and shows higher 
frequency component as the position moves in the right direction and in 
the downward direction along the arrows. 

Then, in "0" component embedding means 4000, an s-fold area of the
transformed image in the frequency space, that is, an area of sn x sn as
shown in FIG. 41 (c), is prepared. The frequency area of n x n shown in
FIG. 41 (b) that is obtained by the orthogonal transform is copied into
the low frequency part of the sn x sn area, while the remaining high
frequency part is filled with "0". Finally, in inverse orthogonal
transform means 4001, this frequency area of sn x sn is
inverse-orthogonally transformed, whereby s-fold image data in the real
space as shown in FIG. 41 (d) is obtained, and an estimated enlarged image
is outputted by Estimated Enlarged Image (EsEI) output means 4002 in
FIG. 40.
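
The procedure of FIG. 41 can be sketched as follows. The sketch assumes
that the orthogonal transform is an orthonormal DCT, a square n x n image
and an integer enlargement ratio s; the function names are illustrative.
The final multiplication by s is a consequence of the orthonormal scaling
chosen here, not something stated in the text.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def enlarge_zero_embed(img, s):
    """Enlarge an n x n image s-fold by zero-filling the high frequencies."""
    n = img.shape[0]
    c_n = dct_matrix(n)
    coeff = c_n @ img @ c_n.T            # FIG. 41 (a) -> (b)
    padded = np.zeros((s * n, s * n))    # the sn x sn frequency area
    padded[:n, :n] = coeff               # copy into the low frequency part
    c_big = dct_matrix(s * n)
    # Inverse transform, FIG. 41 (c) -> (d); the factor s keeps the
    # brightness unchanged under the orthonormal scaling used here.
    return s * (c_big.T @ padded @ c_big)

img = np.arange(16, dtype=float).reshape(4, 4)
print(enlarge_zero_embed(img, 2).round(2))
```

No new high frequency information is created by this method, which is why
the result is smooth rather than sharp.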

In addition to that method in which the high frequency component is
interpolated with "0", a technique is proposed in which the high
frequency components are restored in a process of repeating transform and
inverse transform of image data using orthogonal transform (a method
involving Gerchberg-Papoulis iteration), as disclosed in unexamined
Japanese patent application 6-54172. Furthermore, a method disclosed in
unexamined Japanese patent application 8-294001 is proposed in which the
orthogonal transform components of the original image are embedded in the
low frequency area of the enlarged image, while the high frequency area
is embedded with frequency information obtained on the basis of a
prediction rule prepared in advance.

In conventional image processing devices such as the digital camera,
however, the image data once compressed and stored has to be expanded
again. In addition, the compression is a lossy compression involving
quantization in JPEG, so the original image data cannot be reconstructed
from the compressed image, and it often happens that some noise and color
difference are caused.

Another problem is that color difference and moire are caused by the
thinning out at the time of the resolution transform needed to generate a
reduced image (hereinafter, thumbnail image) to be displayed for review,
so smoothing by a median filter or mean value filter is indispensable.
But this raises a further problem: if the median filter or the mean value
filter is used, the sharpness of the thumbnail image is lost and the
smoothing takes much time.

Furthermore, the following problems are encountered with the 
conventional image enlargement required when a low-resolution image by 
the digital camera is printed or displayed. 

First, in the case of method (1), in which interpolation is merely
performed between the picture elements, it often happens that when a
natural image is photographed by such a low-resolution device, the edge
is not extracted with precision from the original image in the system
shown in FIG. 42. As a result, the positions where picture elements are
interpolated and the interpolated data cannot be adjusted well. To avoid
this, the enlarged image is repeatedly processed by an edge-enhancement
filter. But the problem with conventional edge-enhancement filter
processing is that unless the right filter is selected, the processing
has to be repeated many times, or the edge is not enhanced at all.

In method (2), in which the image is enlarged by embedding "0" in the
high frequency area, on the other hand, a better enlarged image can be
obtained than with techniques that interpolate between the picture
elements, such as the linear interpolation method and the nearest
neighbor method. But since the high frequency components lost in data
sampling are not restored well, no sufficiently sharp image can be
obtained.

In the method in which the high frequency components are restored in
the process of repeating forward transform and inverse transform of an
image by orthogonal transform, the forward transform and inverse
transform are repeated, and that increases the arithmetic processing.
That is, the problem is the processing speed. The amount of arithmetic
processing is not so troublesome if the enlargement ratio s is small, but
if the enlargement ratio s is large, the amount of arithmetic processing
of the inverse transform relative to that of the forward transform
increases approximately in proportion to s x n. In the two-dimensional
processing actually performed, the amount of arithmetic processing
increases roughly in proportion to the cube of s x n. In the enlargement
of a color image in particular, enlargement is necessary for a plurality
of color components, further increasing the time needed for the
processing. Furthermore, in case the image to be enlarged is low in
resolution, the high frequency components will not be restored
accurately.

The method disclosed in unexamined Japanese patent application
8-294001 takes those problems into consideration and addresses both the
processing time and the restoration of the high frequency components.
This method involves embedding frequency information obtained on the
basis of a prediction rule prepared in advance in the high frequency area
of the original image. Therefore, it is necessary to work out in advance
a rule between the high frequency components and the other areas on the
basis of a large number of picture samples. It takes much labor to
prepare a proper rule base, and if a proper rule cannot be established,
there is a risk that sufficient effects will not be obtained.

In addition, the image size is generally arbitrary, and the larger the
size for orthogonal transform, the longer the processing takes.
Therefore, it is usual that the whole of an image of a specific size is
not orthogonally transformed at one time; instead, the orthogonal
transform is performed on blocks of 4 to 16 picture elements in size. The
problem is that discontinuity between the blocks (block distortion) can
occur at the block borders in an enlarged image.

SUMMARY OF THE INVENTION

In view of the problems set forth above, the present invention has been
made, and it is an object of the present invention to provide an image
processing device and image processing method which permit reduction of
the memory size of image data and enlargement of an image to a sharp and
high-quality image.

To achieve the object, the following means are adopted in the 
present invention. 

First, to reduce the memory size of image data, as shown in FIG. 1, an
original image obtained from image input means 10 is orthogonally
transformed by Original Images (OI) orthogonal transforming means 11 so
as to generate the frequency components of the original image, and the low
frequency components are extracted from the frequency components of the
original image by Low Frequency Components (LFC) extracting means 100.
Then, High Frequency Components (HFC) encoding means 13 works out the
relation information between the low frequency components and the
remaining high frequency components in the frequency components of the
original image and encodes the information, and Codes Synthesizing means
14 synthesizes the low frequency components and the relation information
into simplified image data.

From the simplified image data thus generated, Low Frequency 
Components (LFC) decoding means 16 extracts low frequency components, 
and at the same time, High Frequency Components (HFC) decoding means 
17 takes out the relation information and decodes the high frequency 
components on the basis of the low frequency components. Original 
Images (OI) output means 18 combines the low frequency components and 
high frequency components and subjects the combination to inverse 
orthogonal transform to restore the original image. 

That permits the handling of simplified image data smaller in size
than the original image, and can produce a restored image that is
visually close to the original image.

The simplified image data can be processed by inputting it to means
that can restore an image, such as a personal computer, and can also
first be stored in storage medium 15. Also, in the simplified image data,
the data amount of the low frequency components can be further compressed
by Low Frequency Components (LFC) compression means 300. In this case, it
is desirable that a lossless compression method be used.



It is possible for Reduced Images (RI) generating means 101 to
extract the frequency area corresponding to a specified size (preview
size, for example) from the low frequency components and to generate a
reduced image by performing inverse orthogonal transform on those
components.
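
Generating a reduced image directly from low frequency components can be
sketched as follows. The sketch assumes an orthonormal DCT as the
orthogonal transform and a square n x n image, and takes the k x k
lowest-frequency coefficients as the area corresponding to the preview
size; the scaling factor k / n follows from the orthonormal normalization
assumed here. The function names are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n).reshape(-1, 1)
    x = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def reduced_image(img, k):
    """k x k preview built from the k x k lowest-frequency coefficients."""
    n = img.shape[0]
    c_n = dct_matrix(n)
    coeff = c_n @ img @ c_n.T        # frequency components of the image
    low = coeff[:k, :k]              # area matching the preview size
    c_k = dct_matrix(k)
    # Inverse transform at the small size; k / n keeps the brightness.
    return (k / n) * (c_k.T @ low @ c_k)

img = np.arange(64, dtype=float).reshape(8, 8)
print(reduced_image(img, 4).round(1))
```

No separate thinning-out step is needed in this sketch, because the
discarded coefficients are exactly the frequencies that the smaller image
cannot represent.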

In the present invention, furthermore, when a specific image is
enlarged to a desired size, Shortage Components (ShC) estimating means
500 estimates the high frequency components that are lacking on the basis
of the frequency components of the image, as shown in FIG. 5. EI output
means 501 then combines the frequency components of the specific image
with the high frequency components obtained by ShC estimating means 500
and subjects the combination to inverse orthogonal transform, thereby
outputting an image enlarged to the desired size.

According to the present invention, furthermore, in an image
processing device for processing an original image inputted from image
input means 10 as shown in FIG. 7, OI orthogonal transforming means 11
subjects the image data to orthogonal transform to generate the frequency
components of the original image, and from these frequency components
Enlarged Frequency (EF) estimating means 800 estimates the frequency
components of the original image when enlarged at each desired
enlargement ratio. On the basis of the frequency components of the
original image thus obtained and the frequency components of the
estimated enlarged images, EF estimating means 800 extracts, as basic
component, the frequency component necessary for restoring the specified
basic image to a predetermined size, and multiple image encoding means
802 works out the relation information between the basic component and
each of the frequency components corresponding to the estimated enlarged
images and encodes the information. The basic component thus obtained and
the relation information corresponding to each enlargement ratio are
synthesized by Multiple Codes (MC) synthesizing means 803 to generate
multiple simplified image data.

That makes it possible to generate multiple simplified image data 
corresponding to a plurality of sizes on the side where an image data is 
encoded. 

From the multiple simplified image data thus generated, the basic 
component and the relation information are extracted, and the image data 
can be restored on the basis of the basic component and the relation 
information. 

In the multiple simplified image data, too, the data size can be 
further reduced by the data size compression of the basic component. 

Secondly, to realize clear and high-quality enlarged image data,
inter-picture element interpolating means 1100 performs interpolation
between picture elements of image data inputted from image input means
10, according to a desired enlargement ratio, as shown in FIG. 10. The
interpolated enlarged image thus obtained is not sharp in the edge
portions, but convolution means 1101 performs a convolution calculation
on the interpolated enlarged image to enhance the edge portions, so that
an enlarged image with sharp edges is generated without time-consuming
frequency transform.
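
The two steps, interpolation followed by an edge-enhancing convolution,
can be sketched as follows. This passage does not specify the convolution
kernel, so the 3 x 3 sharpening kernel below (identity minus a 4-neighbour
Laplacian) is purely an assumption, as are the function names and the use
of separable linear interpolation.

```python
import numpy as np

# Assumed kernel: identity minus a 4-neighbour Laplacian (weights sum to 1).
SHARPEN = np.array([[0.0, -1.0, 0.0],
                    [-1.0, 5.0, -1.0],
                    [0.0, -1.0, 0.0]])

def enlarge_linear(img, s):
    """Separable linear interpolation of a 2-D array by integer factor s."""
    h, w = img.shape
    xs = np.linspace(0, w - 1, w * s)
    ys = np.linspace(0, h - 1, h * s)
    rows = np.array([np.interp(xs, np.arange(w), r) for r in img])
    return np.array([np.interp(ys, np.arange(h), c) for c in rows.T]).T

def convolve3x3(img, kernel):
    """3 x 3 filtering with edge replication (the kernel is symmetric,
    so correlation and convolution give the same result)."""
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

step = np.zeros((4, 4))
step[:, 2:] = 100.0                      # a vertical edge
soft = enlarge_linear(step, 2)           # interpolated: edge is smeared
sharp = convolve3x3(soft, SHARPEN)       # edge steepened again
print(soft[0].round(1))
print(sharp[0].round(1))
```

The sharpened values overshoot the original range near the edge, so a
practical implementation would probably clip the result to the valid
picture element range.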

Thirdly, to realize a clear and high-quality enlargement of an image,
Enlarged Frequency (EF) estimating means 120A estimates the frequency
components of an enlarged image on the basis of the frequency components
of the original image obtained by OI orthogonal transforming means 11, as
shown in FIG. 12. On the frequency components generated by EF estimating
means 120A, inverse orthogonal transform means 1213 performs inverse
orthogonal transform corresponding to the enlargement size and obtains
enlarged image data.

Estimation of the frequency components by EF estimating means 120A
is performed using a linear approximation or a radial basis function
network. The following method of estimating the high frequency components
of an enlarged image is also adopted. That is, an edge image, which is
considered to contain plenty of high frequency components, is taken out
of the original image, and from an enlarged edge image obtained by linear
transform, the high frequency components of the enlarged image are
estimated using orthogonal transform.

Furthermore, the following method of estimating the frequency
components of an enlarged edge image with high precision is used. That
is, on the basis of the frequency components of the edge image taken out,
the frequency components of the enlarged image are estimated with
precision using the linear approximation or the radial basis function
network.

Next, as shown in FIG. 23, block dividing means 2300 divides inputted
image data into a plurality of blocks, taking into consideration the
processing time required for the orthogonal transform. Enlarged Block
Images (BI) frequency estimating means 2302 then estimates the frequency
components of the enlarged image block for every block divided from the
original image. Then, as shown in FIG. 24, neighboring blocks are partly
overlapped, and in the overlapped part of the enlarged block images the
enlarged block generated later is adopted, thus reducing the block
artifacts.
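
The overlapped block processing can be sketched as follows. The sketch
assumes a raster scan order, an integer enlargement ratio s, and a
caller-supplied function enlarge_block that stands in for the per-block
frequency estimation and inverse transform; all names and the block and
overlap sizes are hypothetical. Where enlarged blocks overlap, the block
written later simply overwrites the earlier one.

```python
import numpy as np

def enlarge_by_blocks(img, block, overlap, s, enlarge_block):
    """Enlarge img block by block; later blocks win in overlapped parts.

    Requires 0 <= overlap < block. Blocks at the right and bottom borders
    may be smaller than block x block.
    """
    h, w = img.shape
    step = block - overlap
    out = np.zeros((h * s, w * s))
    for y in range(0, h - overlap, step):
        for x in range(0, w - overlap, step):
            tile = img[y:y + block, x:x + block]
            big = enlarge_block(tile, s)
            # Paste at the enlarged position, overwriting earlier blocks.
            out[y * s:y * s + big.shape[0], x * s:x * s + big.shape[1]] = big
    return out

def replicate(tile, s):
    """Stand-in enlarger (pixel replication), used only to show the tiling."""
    return np.kron(tile, np.ones((s, s)))

img = np.arange(64, dtype=float).reshape(8, 8)
out = enlarge_by_blocks(img, block=4, overlap=1, s=2, enlarge_block=replicate)
print(out.shape)
```

With a transform-based enlarger the distortion is usually largest near the
block border, so overwriting the border strip of an earlier block with
the interior of a later one is one plausible reading of why this reduces
the visible seams.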

Furthermore, the original image is processed by the transform
function, thus reducing the discontinuity of the image at the block
borders that is caused by dividing the image into blocks.

In case the original image is a color image, Standard Component (SC)
selecting means 2800 selects a color component to serve as the standard
from among the color components making up an inputted color image and
generates an enlarged image corresponding to the standard color
component, as shown in FIG. 28.

For the remaining color components, Shortage Component (ShC)
enlarging means 2805 makes an estimation using the transform ratio
derived by Transform Ratio (TR) deriving means 2801 from the color
original image to the enlarged image of the standard color component,
thereby speeding up the processing in generating enlarged image data of
color image data.

Fourthly, to realize a sharp and high-quality enlargement of an
image, multiple resolution analysis in Wavelet transform is also
utilized.

For this, when an original image with n picture elements x m picture
elements is enlarged to make an enlarged image with Ln picture elements x
Lm picture elements, Input Images (II) regulating means 12 interpolates
or thins out the original image (hereinafter both expressed as
"regulate") to Ln/2 picture elements x Lm/2 picture elements, as shown in
FIG. 29. To the image regulated by II regulating means 12, image
enlarging means 290A applies a method based on Wavelet transform and
generates an enlarged image.

To be concrete, in the enlargement, four images, namely an image
regulated according to the number of picture elements of the enlarged
image to be obtained, the edge image in the vertical direction, the edge
image in the horizontal direction and the edge image in the oblique
direction, are regarded as the four sub-band images making up a Wavelet
transform image. By performing inverse Wavelet transform on the sub-band
images, an enlarged image of a desired picture element size is obtained.

Also, in the enlargement, the relation is found between three edge
images obtained from a 1/4 size reduced image in the low frequency area of
transform image data, that is, Wavelet transformed original image data,
and the remaining three sub-band images within the transform image data.
Here, the edge image in the vertical direction, the edge image in the
horizontal direction and the edge image in the oblique direction of the
image regulated according to the number of picture elements of the
enlarged image to be obtained are each corrected using the above relation
information.

And the regulated image and the three corrected edge images are 
regarded as four sub-band images making up the transform image data. 
By performing inverse Wavelet transform on the sub-band images, an 
enlarged image of a desired picture element size is obtained. 
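
Treating four images as sub-bands and applying an inverse Wavelet
transform can be sketched as follows. The text does not name the wavelet,
so the one-level orthonormal Haar transform used here is an assumption, as
are the function names. With this normalization the low frequency band is
twice the local mean, so a regulated image would be multiplied by 2 before
being used as that band.

```python
import numpy as np

def haar_forward(x):
    """One-level 2-D Haar transform of an array with even height and width."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]   # upper-left, upper-right samples
    c, d = x[1::2, 0::2], x[1::2, 1::2]   # lower-left, lower-right samples
    low = (a + b + c + d) / 2             # low frequency band
    d_vert = (a - b + c - d) / 2          # left-right difference (vertical edges)
    d_horiz = (a + b - c - d) / 2         # top-bottom difference (horizontal edges)
    d_diag = (a - b - c + d) / 2          # diagonal difference (oblique edges)
    return low, d_vert, d_horiz, d_diag

def haar_inverse(low, d_vert, d_horiz, d_diag):
    """Rebuild an image with four times as many picture elements."""
    h, w = low.shape
    out = np.empty((2 * h, 2 * w))
    out[0::2, 0::2] = (low + d_vert + d_horiz + d_diag) / 2
    out[0::2, 1::2] = (low - d_vert + d_horiz - d_diag) / 2
    out[1::2, 0::2] = (low + d_vert - d_horiz - d_diag) / 2
    out[1::2, 1::2] = (low - d_vert - d_horiz + d_diag) / 2
    return out

regulated = np.array([[10.0, 20.0], [30.0, 40.0]])
zeros = np.zeros_like(regulated)
# With all-zero edge bands the result is plain pixel replication; non-zero
# edge bands are what add detail to the enlarged image.
print(haar_inverse(2 * regulated, zeros, zeros, zeros))
```

The quality of the enlargement therefore depends entirely on how well the
three edge images approximate the true detail bands, which is what the
correction by relation information is meant to improve.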

In the enlargement, furthermore, the relation is found between one
typical edge image obtained from the 1/4 size reduced image present in
the low frequency area of the transform image data and the remaining
three sub-band images within the transform image data. Here, one typical
edge image obtained from the image regulated according to the number of
picture elements of the enlarged image to be obtained is corrected using
the relation information.

And the regulated image data and the three image data obtained 
by correcting are regarded as four sub-band images making up the 
transform image data, and inverse Wavelet transform is performed to 
obtain an enlarged image of a desired picture element size. 

Also, as shown in FIG. 37, Enlarging Process (EP) initializing means
3700 sets an original image as the object image for enlargement, and
Object Images (ObI) enlarging means 3701 applies a method based on
Wavelet transform to the object image for enlargement to generate an
enlarged image having four times as many picture elements. Multiple
Processing (MP) ending judge means 3703 sets the enlarged image obtained
by ObI enlarging means 3701 as the object image for enlargement and
returns the process to ObI enlarging means 3701. Enlarged Images (EI)
presenting means 3702 visually presents the enlarged image obtained from
ObI enlarging means 3701. Furthermore, image fine-adjustment means 3704
enlarges or reduces the enlarged image presented by EI presenting means
3702.

To be concrete, in case the target size of the enlargement is not
known in advance, the above enlargement system using the Wavelet
transform is adopted. With the enlarged image obtained by inverse Wavelet
transform as the next object image, inverse Wavelet transform is
performed again. That is, enlargement to an image having four times as
many picture elements is repeated. When the enlargement reaches a
visually desired enlargement ratio, the process is stopped. In that way,
enlargement of the inputted original image is performed.

In case the original image is a color image, in the enlargement
method using the Wavelet transform, too, a color component is selected as
the standard from among the color components making up the color image,
and an enlarged image is generated for that standard color component.
The remaining color components are found by performing linear transform
on the enlarged image of the standard color component using the ratio of
each remaining color component to the standard component, thus speeding
up the processing in generating an enlarged image of a color image.

BRIEF DESCRIPTION OF THE DRAWINGS

Having summarized the invention, a detailed description of the
invention follows with reference being made to the accompanying drawings
which form part of the specification, of which:

FIG. 1 is a block diagram showing the arrangements of first, 
second and third embodiments of the present invention. 

FIG. 2 is a schematic diagram showing the processing steps of the 
first embodiment of the present invention. 

FIG. 3 is a block diagram of the fourth embodiment of the present 
invention including an image processing device. 

FIG. 4 is a schematic diagram showing the procedure for spatial
prediction and compression used in the fourth embodiment of the present
invention.

FIG. 5 is a block diagram of the fifth embodiment of the present 
invention including an image processing device. 

FIG. 6 is a schematic diagram showing the procedure of means for 
estimating shortage components used in the fifth embodiment of the 
present invention. 

FIG. 7 is a block diagram showing the arrangements of sixth, 
seventh and eighth embodiments of the present invention. 

FIG. 8 is a schematic diagram showing the processing steps of 
means for extracting basic components and multiple image encoding means 
used in the sixth embodiment of the present invention. 

FIG. 9 is a block diagram of the ninth embodiment of the present 
invention including an image processing device. 

FIG. 10 is a block diagram showing the arrangement of a tenth 
embodiment of the present invention. 

FIG. 11 is a schematic diagram showing the procedure of the image 
processing device of the tenth embodiment of the present invention. 

FIG. 12 is a block diagram showing the arrangement of an eleventh
embodiment of the present invention.

FIG. 13 is a diagram schematically showing the processing steps of
an image processing device of the eleventh embodiment of the present
invention.

FIG. 14 is a schematic diagram of the radial basis function 
network. 

FIG. 15 is a diagram schematically showing the relation between 
an original image and an enlarged image in the image processing device of 
the eleventh embodiment of the present invention. 

FIG. 16 is a block diagram showing the arrangements of an image 
processing device of the twelfth embodiment of the present invention. 

FIG. 17 is a diagram schematically showing the processing steps of 
the image processing device of the twelfth embodiment of the present 
invention. 

FIG. 18 is a diagram showing Laplacian filter examples. 

FIG. 19 is a block diagram showing the arrangements of the image 
processing device of the thirteenth embodiment of the present invention. 

FIG. 20 is a diagram schematically showing the processing steps of 
the image processing device of the thirteenth embodiment of the present 
invention. 

FIG. 21 is a diagram schematically showing the processing in case 
linear approximation is used by means for generating edge frequency of 
the thirteenth embodiment of the present invention. 

FIG. 22 is a diagram schematically showing the distortion between 
the enlarged blocks. 

FIG. 23 is a block diagram showing the arrangement of the image 
processing device of a fourteenth embodiment of the present invention. 

FIG. 24 is a diagram schematically showing the processing steps of 
the image processing device of the fourteenth embodiment of the present 
invention. 

FIG. 25 is a block diagram showing the arrangement of the image 
processing device of a fifteenth embodiment of the present invention. 

FIG. 26 is a diagram schematically showing the processing steps of 
the image processing device of the fifteenth embodiment of the present 
invention. 

FIG. 27 is a diagram schematically showing the processing steps by 
means for transforming data within the block of the image processing 
device of the fifteenth embodiment of the present invention. 

FIG. 28 is a block diagram showing the arrangement of the image 
processing device of a sixteenth embodiment of the present invention. 

FIG. 29 is a block diagram showing the arrangement of the image 
processing device of a seventeenth embodiment of the present invention. 

FIG. 30 is a diagram showing the filter processing steps in the Wavelet 
transform. 

FIG. 31 is a diagram schematically explaining the sub-band 
components of Wavelet transform coefficients. 

FIG. 32 is a diagram schematically showing the estimation of 
shortage sub-band components in a transform image. 

FIG. 33 shows filter examples applied for detection of edges in the 
vertical direction, horizontal direction and oblique direction. 

FIG. 34 is a block diagram showing the arrangement of means for 
enlarging images making up the image processing device of an eighteenth 
embodiment of the present invention. 

FIG. 35 is a diagram schematically showing the estimation and 
correction of shortage sub-band components in a transform image. 

FIG. 36 is a block diagram showing the arrangement of means for 
enlarging images making up an image processing device of a nineteenth 
embodiment of the present invention. 

FIG. 37 is a block diagram showing the arrangement of the image 
processing device of a twentieth embodiment of the present invention. 

FIG. 38 is a block diagram showing the arrangement of the image 
processing device of a twenty-first embodiment of the present invention. 

FIG. 39 is a block diagram showing the arrangement of a 
conventional digital camera. 

FIG. 40 is a block diagram showing the arrangement of a 
conventional image enlarging device in which an image is transformed in 
the frequency area and enlarged. 

FIG. 41 is an explanatory diagram showing an example in which a 
conventional image is transformed in the frequency area and enlarged. 

FIG. 42 is a block diagram showing the arrangement of a 
conventional image enlarging device in which an image is enlarged by 
interpolation between picture elements. 

FIG. 43 is a schematic diagram showing the conventional 
interpolation between picture elements. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Now, the embodiments of the present invention will be described 
with reference to the drawings. 

It is to be understood that hereinafter the unit of coordinate values 
is identical with the unit of the distance between picture elements. 

Also, hereinafter, the original image described by way of example is 
an original image taken in by a scanner, digital camera or the like, 
but this is not restrictive. The original image may be a specific 
image or the like held on a magnetic disk etc., or an image or the like 
sent in through communication means such as the Internet. 

Embodiment 1 

First, there will be described the image processing device of 
Embodiment 1 of the present invention. 

In FIG. 1, an original image of a size N picture elements x N 
picture elements obtained by image input means 10 such as CCD elements is 
transformed into component data (frequency component data) in frequency 
space by OI orthogonal transforming means 11. For that, orthogonal 
transform is used; available transform methods include the Hadamard 
transform, fast Fourier transform (FFT), discrete cosine transform (DCT), 
slant transform, and Haar transform. In this description, DCT is used, 
but the present invention is not limited to that transform technique, and 
other orthogonal transform methods are applicable. 

It is also noted that the DCT used here is the two-dimensional DCT, 
which is suited to dealing with images, and the transform formula is 
given by Formula 2. 

\[
F(u,v) = \frac{2\,c(u)\,c(v)}{\sqrt{K_x K_y}}
\sum_{x=0}^{K_x-1} \sum_{y=0}^{K_y-1}
D(x,y)\,\cos\frac{(2x+1)u\pi}{2K_x}\,\cos\frac{(2y+1)v\pi}{2K_y},
\qquad 0 \le u \le K_x-1,\quad 0 \le v \le K_y-1
\]

\[
c(0) = \frac{1}{\sqrt{2}}, \qquad c(k) = 1 \quad (k \ne 0)
\]

where F(u, v) represents a DCT component at the component position (u, 
v), and D(x, y) represents image data at picture element position (x, y). 
Furthermore, K_x indicates the number of picture elements in the x 
direction and K_y indicates the number of picture elements in the y 
direction. In this embodiment, a square image of N x N is taken 
up, that is, K_x = K_y = N. 

LFC extracting means 100 extracts low frequency components of 
the frequency component data of the original image as shown in FIG. 2 (b). 
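
The transform of Formula 2 and the subsequent extraction of low frequency components can be illustrated with a minimal sketch. It is not the device's actual implementation. It assumes that the image is a list of lists indexed as image[x][y], and that the low frequency area is the square block of smallest (u, v) positions, since FIG. 2 is not reproduced here. The function names are hypothetical.

```python
import math

def dct_matrix(k):
    """Return the k x k basis C with C[u][x] = sqrt(2/k) * c(u) * cos((2x+1)u*pi/(2k))."""
    rows = []
    for u in range(k):
        cu = 1.0 / math.sqrt(2.0) if u == 0 else 1.0  # c(0) = 1/sqrt(2), c(k) = 1 otherwise
        rows.append([math.sqrt(2.0 / k) * cu *
                     math.cos((2 * x + 1) * u * math.pi / (2 * k))
                     for x in range(k)])
    return rows

def dct2(image):
    """Two-dimensional DCT of image[x][y] following Formula 2 (direct, unoptimised sum)."""
    kx, ky = len(image), len(image[0])
    cx, cy = dct_matrix(kx), dct_matrix(ky)
    # The product of the two basis factors equals 2*c(u)*c(v)/sqrt(K_x*K_y) times the cosines.
    return [[sum(cx[u][x] * cy[v][y] * image[x][y]
                 for x in range(kx) for y in range(ky))
             for v in range(ky)]
            for u in range(kx)]

def extract_low_frequency(coeffs, size):
    """Keep the size x size block of lowest-frequency components (assumed layout)."""
    return [row[:size] for row in coeffs[:size]]
```

For a flat 4 x 4 image of density 10, only the component F(0, 0) is non-zero, and its value is 40 with this normalization.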

HFC encoding means 13 encodes the high frequency area (e) that 
is left when the low frequency components (c) are taken out from the 
frequency components of the original image in FIG. 2 (b). In this 
embodiment, the area of the high frequency components is divided into 
small blocks H1 to H5 as shown in FIG. 2 (e), for example. Likewise, the 
area of the low frequency components (c) is divided into blocks L1 to L4 as 
shown in FIG. 2 (f). The frequency components within the respective 
blocks H1 to H5 of the high frequency components (e) and the respective 
blocks L1 to L4 of the low frequency components (f) are then related to 
each other. For example, the relation may be in the form of a linear 
expression multiplied by certain constant coefficients as in Formula 3, or 
the frequency components within the respective blocks H1 to H5 may be 
approximated by a multi-dimensional function ψ(L1, L2, L3, L4) with the 
frequency components of the respective blocks L1 to L4 as variables. 

Formula 3 

\[
\begin{aligned}
H_1 &= \alpha \cdot L_2 \\
H_2 &= L_4 \\
H_3 &= \beta \cdot L_3 \\
H_4 &= L_3 \\
H_5 &= \gamma \cdot L_2
\end{aligned}
\]

Within the respective blocks, furthermore, it is not necessary to 
have the same coefficient. Instead, for the respective blocks H1 to H5, 
coefficient matrixes M_C1 to M_C5 are provided, and using them and the low 
frequency component matrixes M_L1 to M_L4 made up of the frequency 
components within blocks L1 to L4, it is also possible to express the high 
frequency component matrixes M_H1 to M_H5 made up of the frequency data 
within blocks H1 to H5. In any of the methods, the high frequency data 
size can be reduced by formulating the remaining high frequency 
components in FIG. 2 (e), which express the clearness of the image and 
edge information, using the low frequency component data in FIG. 2 (c). 
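
The constant-coefficient relation of Formula 3 can be illustrated by a short sketch. The description does not state how the constants are determined, so the least-squares fit below is only one possible choice and is an assumption. The function names are hypothetical, and both blocks are assumed to be flattened to lists of equal length.

```python
def fit_scale(high_block, low_block):
    """Least-squares constant a minimising sum((h - a*l)^2); an assumed fitting rule."""
    num = sum(h * l for h, l in zip(high_block, low_block))
    den = sum(l * l for l in low_block)
    return num / den if den else 0.0

def predict_high(low_block, a):
    """Rebuild a high frequency block from a low frequency block and its constant."""
    return [a * l for l in low_block]
```

Under this sketch only the constant for each block would be stored instead of the block's coefficients, which is where the size reduction comes from.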

Codes Synthesizing means 14 synthesizes two items. The first is the 
code string of the relation information between the low frequency 
components and high frequency components, obtained from the rule 
description produced by HFC encoding means 13 or from data such as the 
coefficients of the approximation expression used by HFC encoding means 
13. The second is the low frequency component data of the original image 
obtained from LFC extracting means 100. Codes Synthesizing means 14 then 
stores the data on the storage medium 15 as simplified image data. To 
make reading and restoring the respective data efficient, a table is 
provided within the storage medium 15, and information such as the number 
of stored image data, the sizes of the respective image data, the ratio 
of low frequency component data in the respective stored data, and the 
ratio of codes indicating high frequency components is stored in it. 

Embodiment 2 

Next, there will be described the image processing device of 
Embodiment 2 of the present invention. 

In FIG. 1, the processing steps by image input means 10 and OI 
orthogonal transforming means 11 for orthogonal transform of original 
images are the same as those in Embodiment 1 and will not be described. 
LFC extracting means 100 extracts low frequency components according to 
the number of picture elements of Reduced Images (RI) display means 102 
that displays thumbnail images. RI generating means 101 performs 
inverse orthogonal transform on the low frequency components, and 
thumbnail images are displayed on RI display means 102. 

Embodiment 3 

Next, there will be described the image processing device of 
Embodiment 3. 

In FIG. 1, the user gives an instruction signal by inputting means 
such as a keyboard, mouse etc. (not shown) so as to take out desired data 
from the simplified image data stored on the storage medium 15 by the 
image processing device in Embodiment 1 of the present invention, and to 
output the data on a high resolution laser printer or ink jet printer or 
to edit the data on a CRT. According to this instruction signal, LFC 
decoding means 16 first takes out the low frequency components as the 
main data from the storage medium 15. HFC decoding means 17 then decodes 
the high frequency components from the low frequency components obtained 
by LFC decoding means 16 and the relation information between the low 
frequency components and high frequency components held within the 
storage medium 15. Then, OI output means 18 combines the low frequency 
component data and high frequency component data and performs inverse 
orthogonal transform of the corresponding image size so as to output data 
to be handled by other image processing devices, that is, to be displayed 
on a CRT, or to be outputted on a printer or the like. 

Because the image data stored on the storage medium 15 are not 
quantized when the data is encoded, the original image obtained by 
this method is sharper than that obtained by the prior art. 



Embodiment 4 

Next, there will be described the image processing device of 
Embodiment 4 of the present invention. 

In FIG. 3, the processing steps by OI orthogonal transforming 
means 11, LFC extracting means 100, HFC encoding means 13, Codes 
Synthesizing means 14 and the storage medium 15 are the same as in the 
image processing device of Embodiment 1, and will not be explained. 
Considering that the low frequency component data of the original image 
obtained by LFC extracting means 100 is large, and in consideration of 
the restriction imposed by the capacity of the recording medium etc., the 
low frequency component data is compressed by LFC compression means 300. 

The processing method by LFC compression means 300 is based on 
differential encoding (Differential PCM; DPCM) and entropy encoding, 
called the spatial method in JPEG, instead of the usual compression 
technique using DCT, quantization and entropy encoding (the compression 
technique also called the baseline technique). It is desirable to use 
this lossless compression method, which does not cause distortion through 
compression and expansion. 

In this technique, prediction values are worked out by a predictor 
using the neighboring density values, and the prediction value is 
subtracted from the density value to be encoded. For this predictor, 
seven kinds of relational expressions are made available as shown in 
Formula 4. 



Formula 4 

\[
\begin{aligned}
dD_x &= D_1 \\
dD_x &= D_2 \\
dD_x &= D_3 \\
dD_x &= D_3 + (D_2 - D_1) \\
dD_x &= D_3 + (D_2 - D_1)/2 \\
dD_x &= D_2 + (D_3 - D_1)/2 \\
dD_x &= (D_2 + D_3)/2
\end{aligned}
\]

With the density value to be encoded as Dx, the three neighboring 
density values are called D1, D2 and D3 as shown in FIG. 4. The 
prediction value dDx calculated from those three density values is 
defined by Formula 4. Which expression is used is written in the header 
information of the compressed image data. In encoding, one prediction 
expression is selected as shown in FIG. 4 (b), and the prediction error 
Ex, obtained by subtracting dDx from Dx, is worked out. This prediction 
error Ex is entropy encoded. Encoding prediction errors 
makes it possible to compress the low frequency components reversibly. 
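
The seven predictors of Formula 4 and the reversibility of the prediction error can be sketched as follows. This is an illustration only: the mode numbers 1 to 7 simply follow the order of Formula 4, integer density values with floor division are assumed, and the function names are hypothetical. The positions of D1, D2 and D3 are those of FIG. 4 and are not modeled here.

```python
# Predictors in the order listed in Formula 4; the numbering is an assumption.
PREDICTORS = {
    1: lambda d1, d2, d3: d1,
    2: lambda d1, d2, d3: d2,
    3: lambda d1, d2, d3: d3,
    4: lambda d1, d2, d3: d3 + (d2 - d1),
    5: lambda d1, d2, d3: d3 + (d2 - d1) // 2,
    6: lambda d1, d2, d3: d2 + (d3 - d1) // 2,
    7: lambda d1, d2, d3: (d2 + d3) // 2,
}

def prediction_error(dx, d1, d2, d3, mode):
    """Ex = Dx - dDx, the value handed to the entropy encoder."""
    return dx - PREDICTORS[mode](d1, d2, d3)

def reconstruct(ex, d1, d2, d3, mode):
    """Dx = Ex + dDx; the decoder repeats the same prediction, so nothing is lost."""
    return ex + PREDICTORS[mode](d1, d2, d3)
```

Because the decoder has already restored D1, D2 and D3 when it reaches Dx, it can recompute dDx exactly, which is why the scheme is lossless.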

On the basis of the low frequency component data compressed by 
LFC compression means 300, HFC encoding means 13 formulates the 
remaining high frequency components in a way as shown in Formula 3, as 
in Embodiment 1. 

Codes Synthesizing means 14 synthesizes the compressed low 
frequency component data and the code string of the relation information 
between the low frequency components and high frequency components 
obtained from the rule description obtained by HFC encoding means 13 or 
data such as the coefficients etc. of the approximation expression used by 
HFC encoding means 13, and stores the data on the storage medium 15 as 
simplified image data. 

According to this embodiment, it is possible to compress to 1/2 to 
1/4 the low frequency components, which are thought to account for the 
largest portion of the data size in the device of Embodiment 1. 
Also, by using a lossless compression method for compression, it is 
possible to store data without impairing the frequency components 
possessed by the original image. 

It is noted that since the low frequency component data is the basic 
image from which the high frequency components expressing the sharpness 
in the details of the image are restored, a lossless compression method 
is applied to the low frequency components of the original image so as to 
output the clearest image with sharp edges. But the usual JPEG baseline 
method can also be used here. It is a lossy compression, and when it is 
applied some clearness of the image will be lost. But the data size can 
be compressed to 1/10 to 1/20, so this method is used in case the image 
data to be held on memory media and to be read is to be made as compact 
as possible so that much image data can be stored. 

Also, the relation information between the low frequency 
components and high frequency components obtained from HFC encoding 
means 13 in the image processing device of Embodiment 1 and the relation 
information between the basic components and the high frequency 
components of the respective enlarged images obtained from multiple 
image encoding means 802 in the image processing devices in 
Embodiments 6 and 9 can be compressed by lossless compression methods 
such as the spatial method. 

Embodiment 5 

Next, there will be explained the image processing device of 
Embodiment 5. 

In FIG. 5, the processing steps by storage medium 15, LFC 
decoding means 16 and HFC decoding means 17 are the same as in 
Embodiment 3 and will not be explained. 

ShC estimating means 500 is provided with a function of 
estimating the high frequency component data that will be short when the 
data is enlarged to a desired size, as in case the original image read is 
displayed on a high-resolution CRT etc. EI output means 501 combines the 
shortage of high frequency components estimated by ShC estimating 
means 500 and the frequency components of the original image decoded 
from the encoded image data stored on the storage medium 15 by LFC 
decoding means 16 and HFC decoding means 17, and performs inverse 
orthogonal transform corresponding to the enlargement size, thereby 
generating an enlarged image of the original image read. 

ShC estimating means 500 and EI output means 501 process data 
as shown in FIG. 6. First, the frequency component data shown in FIG. 
6 (b), obtained by performing orthogonal transform of an original image of 
a size N picture elements x N picture elements as shown in FIG. 6 (a), 
are embedded in the low frequency area of the frequency components of 
an enlarged image having a coefficient size corresponding to a desired 
enlargement size (sN picture elements x sN picture elements) (FIG. 6 (c)). 
The shortage components H1 to H3 that then occur are estimated from 
the frequency components of the original image shown in FIG. 6 (b). 

There are a number of methods of estimation including: 

(1) Method in which the nonlinear approximation capacity of the radial 
basis function network (RBFN) is applied. 

(2) Method in which, from the frequency components of the edge image of 
the original image, the frequency components of its enlarged edge image 
are estimated with precision using linear approximation or a radial basis 
function network, whereby the high frequency components of the enlarged 
image which were lacking from the original image are estimated. 

(3) Method in which, as disclosed in unexamined Japanese patent 
application 8-294001, the orthogonal transform components of the original 
image are embedded in the low frequency area and the frequency 
information obtained on the basis of a prediction rule prepared in advance 
is embedded in the high frequency area. 

(4) Method in which the high frequency components are restored in a 
process of repeating transform and inverse transform of image data using 
orthogonal transform as proposed in unexamined Japanese patent 
application 6-54172 (method based on Gerchberg-Papoulis repeating). 

(Methods (1) and (2) will be described later.) 

In addition, a number of other techniques of enlarging images using 
the frequency area are available. ShC estimating means 500, however, 
requires a technique that estimates with precision the high frequency 
components necessary to generate an image that is clear in its details 
and has sharp edges. Any technique that satisfies this requirement is 
applicable. 

EI output means 501 then performs inverse orthogonal transform 
corresponding to the coefficient size of sN x sN as shown in FIG. 6 (d), 
whereby the frequency components of the enlarged image estimated by 
ShC estimating means 500 are brought back to real space data, and 
outputted as data to be handled by other image processing devices, for 
example to be displayed on a CRT or the like, or to be sent to an 
outputting unit such as a printer. 

Through such arrangements, the high frequency component data 
required in outputting an enlarged image using frequency components of 
the original image are estimated and compensated, thus avoiding the 
blurred edges and unclear details which are observed in enlargement by the 
prior art picture element interpolation. 
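
The embedding step of FIG. 6 (c) can be sketched as below, as an illustration under stated assumptions rather than the device's method. The coefficient array is assumed to be square with the low frequency area at small (u, v). The `estimate` callback is a hypothetical stand-in for whichever of methods (1) to (4) is chosen. Any amplitude rescaling implied by the normalization of the transform is deliberately left out.

```python
def embed_in_enlarged(coeffs, s, estimate=None):
    """Place N x N coefficients in the low-frequency corner of an sN x sN array.

    `estimate(u, v)` is a hypothetical callback returning the estimated shortage
    component at position (u, v); positions are left at zero when it is None.
    """
    n = len(coeffs)
    m = int(round(s * n))  # coefficient size of the enlarged image
    out = [[0.0] * m for _ in range(m)]
    for u in range(m):
        for v in range(m):
            if u < n and v < n:
                out[u][v] = coeffs[u][v]      # original components, unchanged
            elif estimate is not None:
                out[u][v] = estimate(u, v)    # shortage components
    return out
```

Calling it without `estimate` reproduces the zero-filled enlargement described for the prior art of FIG. 41, which makes the contribution of the estimated components easy to compare.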

Embodiment 6 

Next, there will be explained the image processing device of 
Embodiment 6 with reference to FIG. 7 and FIG. 8. 

In FIG. 7, the processing by image input means 10 and OI 
orthogonal transforming means 11 is identical with that in Embodiment 1 
of the present invention and will not be explained. First, using the 
frequency components of the original image obtained by OI orthogonal 
transforming means 11, EF estimating means 800 estimates the frequency 
component data in enlarging the image to a plurality of image sizes. The 
estimating method can be the same technique as that of ShC estimating 
means 500 of FIG. 5, one of the constituent elements of the image 
processing device in Embodiment 5 of the present invention. 

Then, the basic components are extracted by Basic Components 
(BC) extracting means 807. This process is schematically shown in FIG. 8. 
FIG. 8 (a) shows the frequency components of the original image, FIG. 8 
(b) shows the frequency components of an image produced by enlarging the 
original image twice vertically and twice horizontally, and FIG. 8 (c) shows 
the frequency components of an image produced by enlarging the original 
image three times vertically and three times horizontally. More image 
sizes may be provided, and the enlargement ratio of the image size does 
not have to be an integer. In FIG. 8 (a), (b) and (c), the frequency 
components in the low frequency area represent the overall features of 
an image and have a qualitative tendency which is common to the 
respective enlargement ratios. Therefore, the frequency component L00 in 
the low frequency area that exhibits a similar tendency is taken as the 
basic component as shown in FIG. 8 (d). For simplification of processing, 
the frequency components of the original image shown in FIG. 8 (a) may be 
regarded as the basic component. 

Multiple image encoding means 802 in FIG. 7 relates the basic 
component L00 to the respective blocks H11 to H13 shown in FIG. 8 (b) and 
formulates them (FIG. 8 (e-1)) in the same way as HFC encoding means 13 
does in the image processing devices of Embodiments 1 and 4 of the 
present invention. The respective blocks H21 to H25 shown in FIG. 8 (c) 
can also be related directly to the basic component L00 in the same way. 
In the present embodiment, however, the respective blocks H11 to H13 in 
FIG. 8 (b) are related to the respective blocks H21 to H25 in FIG. 8 (c) 
as shown in FIG. 8 (e-2), so that the basic component (frequency 
component in a low frequency area) L00 and the respective blocks H21 to 
H25 in FIG. 8 (c) are indirectly related and formulated. That is because 
less qualitative difference is observed between neighboring blocks in the 
frequency area, which improves the accuracy of the relating procedure. 

MC synthesizing means 803 synthesizes the extracted basic 
component and the relation information, obtained by multiple image 
encoding means 802, between the basic component and the respective 
frequency components of the enlarged images, and stores the data on the 
storage medium 15 as multiple simplified image data in the same way as 
Codes Synthesizing means 14 does in Embodiments 1 and 4. At that time, it 
is necessary to write the number of enlarged image sizes prepared, 
the starting flag signals of the data of the respective sizes etc. in the 
headers or the like of the multiple simplified image data. 

As set forth above, if enlarged frequency components corresponding 
to a plurality of image sizes, matched to the resolution of the outputting 
device, are prepared in advance, it is possible to save the time for 
estimating the high frequency components that fall short in outputting an 
image of a desired size on the basis of data read from the memory medium, 
thus improving the user interface. 

Embodiment 7 

Next, there will be explained the image processing device of 
Embodiment 7. 

In FIG. 7, the processing steps by image input means 10, OI 
orthogonal transforming means 11, EF estimating means 800 and BC 
extracting means 807 are the same as those in the image processing device 
of Embodiment 6 of the present invention and will not be explained. 

Basic Image (BI) generating means 808 performs inverse orthogonal 
transform on the basic component extracted by BC extracting means 807. 
In case the size (coefficient size) in the frequency space of the frequency 
components extracted by BC extracting means 807 is larger than the size 
of the image for review on the display, frequency components matched to 
the resolution are further extracted from the low frequency area of the 
basic component and a basic image for display is generated. If, on the 
other hand, the coefficient size of the frequency components extracted by 
BC extracting means 807 is smaller than the size of the image for review 
on the display, "0" components as shown in FIG. 41 are embedded in the 
area where the coefficients of the basic component are short, and a 
thumbnail image is generated. It is also possible to include the following 
processing in BI generating means 808: in such a case, the shortage 
components are estimated according to the image size in the same way as 
EF estimating means 800 does, and the image is enlarged to the image size 
for review. 

Embodiment 8 

Next, there will be explained the image processing device of 
Embodiment 8. 

In FIG. 7, the user can select an image to be outputted or edited 
from among the image data stored on the storage medium 15 
by the image processing device of Embodiment 6 of the present invention, 
and can also specify the image size. In this case, the following methods 
can be used: selection information according to the image sizes 
stored on the storage medium 15 is presented, or the user inputs an 
enlargement ratio with the original image size as basis, or the size is 
automatically selected in accordance with the resolution of the equipment 
to which the image is outputted. 

First, there will be described the processing in case selection 
information according to the image sizes stored on the storage medium 15 
is presented. The user selects and specifies a desired image and desired 
image size by inputting means such as the mouse (not shown). From the 
storage medium 15, Basic Components (BC) decoding means 804 takes out 
the basic component of the image from the multiple frequency components 
corresponding to the instruction. Then, Object Frequency (ObF) decoding 
means 805 takes out the expression code string of the high frequency 
component data corresponding to the selected image size, and decodes the 
high frequency components other than the basic component of the object 
enlarged image size from the code string and the basic component. Object 
Images (ObI) output means 806 combines the high frequency components 
and the basic component, performs inverse orthogonal transform and 
outputs the desired enlarged image. 

Next, there will be explained the processing in case the 
enlargement ratio with the original image size as basis is inputted or the 
image size is automatically selected in accordance with the resolution of 
the equipment for outputting. In case the corresponding image size is not 
among the plurality of image sizes held on the storage medium 15, the 
image data of the size nearest the enlarged size is selected, and 
processed by BC decoding means 804, ObF decoding means 805, and then ObI 
output means 806. It is to be understood that ObI output means 806 then 
performs the inverse orthogonal transform corresponding to the image size 
nearest the size after enlargement. 

At this stage, however, it is also possible to select an image size 
smaller than the desired image size and to make up the shortage by ShC 
estimating means 500 in Embodiment 5. 
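
The two selection rules just described can be sketched as below. The function name, the `allow_larger` flag and the representation of an image size as a single integer (for example the side length in picture elements) are assumptions made for illustration.

```python
def choose_stored_size(stored_sizes, requested, allow_larger=True):
    """Pick the stored image size to decode for a requested output size.

    allow_larger=True  -> the stored size nearest the request is returned.
    allow_larger=False -> the largest stored size not exceeding the request is
                          returned, leaving the shortage to be estimated;
                          falls back to the smallest stored size.
    """
    if allow_larger:
        return min(stored_sizes, key=lambda size: abs(size - requested))
    candidates = [size for size in stored_sizes if size <= requested]
    return max(candidates) if candidates else min(stored_sizes)
```

For stored sizes 256, 512 and 768 and a request of 700, the first rule returns 768 and the second returns 512.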

As set forth above, if the low frequency component that can be 
regarded as common to the frequency components of a plurality of enlarged 
sizes is extracted as basic component and the remaining high frequency 
components of the respective enlarged sizes are encoded on the basis of 
this basic component, frequency components of image on a plurality of 
enlarged sizes can be provided. Therefore, when an enlarged image of the 
needed image size is reproduced according to the user's instruction, the 
enlarged image can be reproduced at a high speed without estimating the 
shortage of high frequency components in the desired enlarged size each 
time an instruction is given. 

Embodiment 9 

Next, there will be explained the image processing device of 
Embodiment 9. 

In FIG. 9, the processing steps by image input means 10, OI 
orthogonal transforming means 11, EF estimating means 800, and BC 
extracting means 807 are the same as those in the image processing device 
of Embodiment 6 of the present invention, and will not be explained. 

Basic Components (BC) compression means 1000 compresses the 
basic component extracted by BC extracting means 807. This compression 
step is identical with that by LFC compression means 300 in the image 
processing device of Embodiment 4 of the present invention and will not be 
explained. 

Also, the processing by multiple image encoding means 802, MC 
synthesizing means 803 and the storage medium 15 following that by BC 
compression means 1000 is the same as that in the image processing 
device of Embodiment 6 of the present invention and will not be explained. 

Through such arrangements, it is possible to reduce image data 
size - the problem that arises when a plurality of image sizes are held. 

Embodiment 10 

Next, there will be explained the image processing device of 
Embodiment 10. 

FIG. 10 is a block diagram showing the arrangement of the device. 

First, inter-picture element interpolating means 1100 interpolates 
between the picture elements of an original image read by image input 
means 10, using the interpolating technique shown in FIG. 43, and 
generates an enlarged image. Then, convolution means 1101 repeats 
convolution of the enlarged image for enhancing the picture elements. 

FIG. 11 shows the convolution process. In case the edge is 
unclear as shown in FIG. 11 (b), image data on the edge cannot be 
extracted, and it is difficult to enlarge the image by the processing 
steps shown in FIG. 42. According to the present invention, however, the 
averaged density value can be enhanced without difficulty. It is noted 
that the processing by convolution means 1101 produces the same effect on 
the enlarged image as when the processing by the edge enhancing filter is 
repeated several times. The conventional processing by edge enhancing 
filter has the problems that the processing has to be repeated many times, 
or that the edge is not enhanced at all unless a suitable filter is 
selected. The convolution method has no such problem because in 
convolution means 1101, convolution is effected by the density value 
itself. If the density value at point P in FIG. 11 (c) at the K-th 
convolution is Dp[K], the density value Dp[K + 1] at point P at the 
(K + 1)-th convolution is defined as in Formula 5. This finds the average 
of the convolution values near that picture element. Here, Ω represents 
the range of object picture elements for convolution, U_Gaso represents 
the total number of picture elements within Ω, and q represents any 
picture element within Ω. 

Formula 5 

\[
D_p[K+1] = \frac{1}{U_{\mathrm{Gaso}}} \sum_{q \in \Omega} D_p[K] \times D_q[K]
\]

If the set of all picture elements contained in the enlarged 
interpolated image is denoted by Λ, all the picture elements within Λ 
will be processed. 

Next, convergence judging means 1102 judges the convergence of this
convolution processing. The judgment is based on the mean value, within
A, of the square errors between the density value Dp[K - 1] obtained at
the (K - 1)-th convolution and the density value Dp[K] obtained at the
K-th convolution, as in Formula 6. If the mean value is smaller than a
preset convergence judgment value Thred, it judges that the density
value has converged and finishes the convolution, and estimated enlarged
image outputting means 1312 outputs the image as the estimated enlarged
image.
Formula 6

$$Thred > \frac{1}{T\_Gaso} \sum_{p \in A} \left( D_p[K] - D_p[K-1] \right)^2$$

In Formula 6, T_Gaso represents the total number of picture elements
within A.
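The repeated convolution of Formula 5 and the convergence judgment of Formula 6 can be sketched as follows. This is only a minimal sketch under stated assumptions, not the implementation of convolution means 1101: it assumes NumPy, density values normalized to the range 0 to 1, a square range Q of (2 x radius + 1) picture elements per side, and replication of the border picture elements. The names convolve_step, enhance, radius and max_iter are hypothetical.

```python
import numpy as np

def convolve_step(d, radius=1):
    """One pass of Formula 5: D_p[K+1] = sum_{q in Q}(D_p[K] * D_q[K]) / U_Gaso."""
    h, w = d.shape
    size = 2 * radius + 1                      # side length of the assumed range Q
    padded = np.pad(d, radius, mode="edge")    # assumption: replicate border values
    neighbor_sum = np.zeros_like(d)
    for dy in range(size):
        for dx in range(size):
            neighbor_sum += padded[dy:dy + h, dx:dx + w]
    return d * neighbor_sum / (size * size)    # U_Gaso = size * size

def enhance(d, thred=1e-6, max_iter=50, radius=1):
    """Repeat Formula 5 until the mean square change (Formula 6) drops below thred."""
    d = np.asarray(d, dtype=float)
    for k in range(1, max_iter + 1):
        nxt = convolve_step(d, radius)
        mean_sq_err = np.mean((nxt - d) ** 2)  # right-hand side of Formula 6
        d = nxt
        if mean_sq_err < thred:
            break
    return d, k
```

The cap max_iter is an addition of this sketch so that the loop always terminates; the text itself relies only on the judgment value Thred.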

As set forth above, the image processing device of the present
embodiment enhances an enlarged image through convolutional arithmetic
processing of interpolated image data, and can produce an enlarged image
simply, without conducting time-consuming frequency transform. It can
also obtain an enlarged image with edge information without difficulty,
even from an original image with unclear edges, which presents a problem
when edge information of the original image is used to keep an
interpolated image from blurring.

Embodiment 11 

FIG. 12 shows the arrangement of the image processing device of
Embodiment 11.

Using original image data of n picture elements x n picture elements
inputted to this device by image input means 10, Approximate Coefficient
(AC) deriving means 1200 within EF estimating means 120A derives, by a
method which will be described later, the approximation coefficient
vector (weight coefficient vector) Vw = (w_0, w_1, ..., w_{N-1})
(N = n x n), which is used in the non-linear interpolation method
applied in estimating the frequency component data of the enlarged
image.

Using weight coefficient vector Vw obtained by AC deriving means 1200
and the frequency component data of the original image obtained by OI
orthogonal transforming means 11, non-linear estimating means 1201 in EF
estimating means 120A estimates the frequency component data of an
enlarged image using the radial basis function network, as will be
described later.

Now, the operation of this image processing device will be explained.

Image input means 10 reads out data of an original image of a size of n
picture elements x n picture elements for enlargement. The original
image obtained by image input means 10 is processed in the same way as
in Embodiment 1 until it is transformed into component data in the
frequency area (frequency component data) by OI orthogonal transforming
means 11, and that processing will not be described again.

On the basis of density value D(x, y) (x, y = 0, ..., n-1) (FIG. 13 (a))
of the original image from image input means 10 and frequency component
data F(u, v) (u, v = 0, ..., n-1) (FIG. 13 (b)) from OI orthogonal
transforming means 11, EF estimating means 120A estimates frequency
component data F'(u, v) (u, v = 0, ..., m-1) (FIG. 13 (c)) of the
enlarged image obtained when the original n x n image is enlarged to an
m x m image (n < m = s x n). This estimation is made using the radial
basis function network (hereinafter RBFN), which will be described
later.

The relation between a two-dimensional DCT component value and its
component position (u, v) is generally not linear but non-linear. For
this reason, a technique that can approximate this non-linear relation
with precision is required to produce a clear enlarged image without the
edge getting blurred. Among such techniques are one using a hierarchical
neural network trained on learning data prepared in advance and a fuzzy
logic reasoning technique using a rule base extracted from learning data
prepared in advance. For these techniques, however, it is indispensable
to prepare a large amount of image data for learning in advance so that
estimation can be made with high precision.

On the other hand, there are techniques that, unlike the above-mentioned
techniques, can self-organize an approximation function using the
features of inputted data without learning data. One example is a
technique that makes short-term predictions of time series patterns that
fluctuate chaotically, and the radial basis function network used here
is considered to be a kind of that technique. Here, the surface drawn by
the frequency components is approximated by a non-linear function using
RBFN from the frequency components of a given image and their
corresponding positions.

FIG. 14 is an example model arrangement of the RBFN used here. The
component position vector VP = (u, v) to be found is given to it as
input, transformed by radial basis functions (RBF) φ at the intermediate
layer, and unified into one at the output layer, and output frequency
component Fout = F'(u, v) is outputted. This is shown in Formula 7,
where VP_i = (u_i, v_i) represents the center of the i-th RBF function
and N represents the number of RBF functions.
Formula 7

$$F'(u, v) = \sum_{i=0}^{N-1} w_i \cdot \phi(\| VP - VP_i \|), \quad 0 \le u \le m-1, \; 0 \le v \le m-1$$

If, for example, an original image is enlarged with frequency component
P as center as shown in FIG. 15, VP_i corresponds to the position vector
of the frequency component of the original image, and the number N of
the RBF functions corresponds to the number n x n of picture elements of
the original image. That is, as many RBF functions as the number of
picture elements of the original image are provided, and frequency
component data after enlargement placed around component position vector
VP_i at P, the center of the RBF function, is estimated as the
superposition of these RBF function outputs. It is noted that the RBF is
a function that changes depending on the distance ||VP - VP_i|| between
component position vector VP and the center VP_i of the i-th RBF
function, and one example is given in Formula 8.

Formula 8

$$\phi(\| VP - VP_i \|) = \| VP - VP_i \|^{k+1} \log(\| VP - VP_i \|) \quad \text{(a)}$$
$$\phi(\| VP - VP_i \|) = \exp(-\| VP - VP_i \|^2 / b) \quad \text{(b)}$$

where (a) is a logistic function and k is a parameter that controls the
slope of the function. Also, (b) is a Gaussian function, and b, like k,
is a parameter expressing its shape. Many other functions are
conceivable, but since (b) is generally used, (b) is used here, too,
with b = 1.0.

Weight coefficient vector Vw = (w_0, w_1, ..., w_{N-1})^T (T:
transposition) is determined as follows.

To optimize the non-linear approximation in Formula 8, weight
coefficient vector Vw should be determined so that, with VP = VP_i, the
estimated value of the frequency component agrees as closely as possible
with the frequency component value obtained from the original image.
Here, P'_i in the frequency area after enlargement corresponds to P_i in
the frequency area of the original image, as shown in FIG. 15. That is,
with the frequency component position at P_i as (a_i, b_i), the Vw that
minimizes square error function E(Vw) between frequency component
F(a_i, b_i) at P_i and estimated frequency component F'(u_i, v_i) at
P'_i should be taken as the optimum weight coefficient vector.

First, estimated components F'(u, v) at (u, v) are arranged from the low
frequency area, and the k-th estimated component corresponding to
frequency component position (a_i, b_i) of the original image is denoted
FF(k). Then, frequency component vector Vy of the estimated enlarged
image and matrix MP made up of RBF functions are defined as in Formula
9.

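Formula 9

$$Vy = (FF(0), FF(1), \ldots, FF(N-1))^T$$

$$MP = [MP_{ij}] = \begin{pmatrix}
\phi(\|VP_0 - VP_0\|) & \phi(\|VP_0 - VP_1\|) & \cdots & \phi(\|VP_0 - VP_{N-1}\|) \\
\phi(\|VP_1 - VP_0\|) & \phi(\|VP_1 - VP_1\|) & \cdots & \phi(\|VP_1 - VP_{N-1}\|) \\
\vdots & \vdots & \ddots & \vdots \\
\phi(\|VP_{N-1} - VP_0\|) & \phi(\|VP_{N-1} - VP_1\|) & \cdots & \phi(\|VP_{N-1} - VP_{N-1}\|)
\end{pmatrix}$$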

From this, frequency component vector Vy can be rewritten as
Formula 10.

Formula 10

$$Vy = MP \cdot Vw$$

The evaluation function E(Vw) (the square error function) is given as in
Formula 11. In Formula 11, vector Vf is the vector in which frequency
components F(a, b) of the original image are arranged one after another
from the low frequency area, the number of arranged components being
N = n x n. This vector is fixed once an original image is given.
Formula 11

$$E(Vw) = \sum_{i=0}^{N-1} \left( F(a_i, b_i) - F'(u_i, v_i) \right)^2 = (Vf - MP \cdot Vw)^T \cdot (Vf - MP \cdot Vw)$$

$$Vf = (F(0,0), F(0,1), \ldots, F(n-1, n-1))^T$$

Then, setting dE(Vw)/dVw = 0 and rearranging, Vw is given as in
Formula 12.

Formula 12

$$Vw = (MP^T \cdot MP)^{-1} \cdot MP^T \cdot Vf$$

Using this, AC deriving means 1200 calculates approximation weight
coefficient vector Vw. Using weight coefficient vector Vw, non-linear
estimating means 1201 estimates DCT component F'(u, v) at (u, v) from
Formula 7.
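The derivation of Vw (Formulas 9 to 12) and the estimation by Formula 7 can be illustrated as follows. This is a minimal sketch under stated assumptions: it uses NumPy, the Gaussian RBF (b) of Formula 8 with b = 1.0, and centers VP_i supplied by the caller as an N x 2 array. The function names are hypothetical. Instead of forming the inverse in Formula 12 explicitly, the sketch calls a least-squares solver, which minimizes the same E(Vw) of Formula 11.

```python
import numpy as np

def gaussian_rbf(r, b=1.0):
    """Formula 8 (b): phi(r) = exp(-r^2 / b)."""
    return np.exp(-(r ** 2) / b)

def rbf_matrix(points, centers, b=1.0):
    """Matrix whose (i, j) entry is phi(||points_i - centers_j||), as in Formula 9."""
    diff = points[:, None, :] - centers[None, :, :]
    return gaussian_rbf(np.linalg.norm(diff, axis=2), b)

def derive_weights(centers, vf, b=1.0):
    """Least-squares Vw for Vf ~ MP * Vw (same minimizer as Formula 12)."""
    mp = rbf_matrix(centers, centers, b)
    vw, *_ = np.linalg.lstsq(mp, vf, rcond=None)
    return vw

def estimate(points, centers, vw, b=1.0):
    """Formula 7: F'(u, v) = sum_i w_i * phi(||VP - VP_i||) at each row of points."""
    return rbf_matrix(points, centers, b) @ vw
```

For example, with the n x n component positions of an original image as centers and its frequency components arranged in vf, estimate(centers, centers, vw) reproduces vf, and other positions passed as points give the interpolated components.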

In this connection, in case (b) in Formula 8 is used, the following
method of setting the value of b is conceivable. Using a large amount of
image data prepared in advance, each original image is estimated from a
reduced image prepared by sub-sampling, and the relation between the
value of parameter b that maximizes the SN ratio between the estimated
image and the original image and the distribution of data within the
respective original images (standard deviation, average, dispersion,
etc.) is derived statistically. The value of b is then decided on from
that relation.

On the basis of the estimation results of the frequency components of
the enlarged image obtained by RI generating means 101, inverse
orthogonal transform means 1213 performs inverse discrete cosine
transform (IDCT), thus restoring the enlarged image as values in real
space. Estimated Enlarged Images (EsEI) output means 1214 outputs the
estimated enlarged image obtained by inverse orthogonal transform means
1213 so that it can be handled by other image processing devices, that
is, displayed on a CRT, outputted on a printer, or the like.

As set forth above, according to the present embodiment, the 
features of the frequency components of an original image can be 
approximated with precision, and it is possible to estimate the high 
frequency components erased in the sampling of the original image in a 
handy manner with high precision without preparing a rule etc. in 
advance. 

Embodiment 12

FIG. 16 shows the arrangement of the image processing device of
Embodiment 12, and there will be described the operation of this image
processing device. As in Embodiment 11, an original image obtained by
image input means 10 is transformed into frequency component F(u, v) by
OI orthogonal transforming means 11. At the same time, edge generating
means 1600 extracts edges by a Laplacian filter shown in FIG. 18.

Either of Laplacian filters shown in FIG. 18 (a) or (b) may be used 
for the purpose. Some other filter may also be used. The precision of 
edge extraction can be improved by selecting a filter depending on the 
features of the original image to be handled. 

For example, the Laplacian filter shown in FIG. 18 (a) multiplies the
density value at the object picture element by eight and subtracts the
density values of the eight surrounding picture elements. That is, the
difference between the density value of the picture element in the
center and the density values of the surrounding picture elements is
added to the density value of the picture element in the center, whereby
picture elements whose density differs greatly from that of their
surroundings, as at an edge, are enhanced.
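The filter operation described for FIG. 18 (a) can be sketched as follows. This is a minimal sketch, assuming NumPy, a 3 x 3 kernel with 8 at the center and -1 at the eight surrounding positions (as described in the text), and replication of the border picture elements. The names LAPLACIAN_8 and extract_edges are hypothetical.

```python
import numpy as np

# Kernel as described in the text: 8 x center minus the eight neighbors.
LAPLACIAN_8 = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]], dtype=float)

def extract_edges(image, kernel=LAPLACIAN_8):
    """Apply a 3 x 3 filter to every picture element of a 2-D density array."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")   # assumption: replicate border values
    out = np.zeros_like(image)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

A region of uniform density gives 0, while an isolated picture element of density 1 gives 8 at its own position and -1 at each of its neighbors. Another kernel, such as that of FIG. 18 (b), can be passed as the kernel argument.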

Enlarged Edge (EE) approximating means 1601 in Enlarged Edge (EEd)
estimating means 120B linearly enlarges the edge image (FIG. 17 (c))
obtained by edge generating means 1600, as shown in FIG. 17, to a
desired enlargement ratio s. As the image is enlarged, interpolation
picture elements have to be embedded between the existing picture
elements. In the present embodiment, that is realized by embedding the
mean value of the edge image data of the existing picture elements at
the center between them. For example, in case an existing edge image is
enlarged to one with picture element positions P(x_0, y_0) and
Q(x_0+dx, y_0) and it is necessary to interpolate between them in the x
direction according to the number of enlarged picture elements, the
interpolation is as given by Formula 13 (a). Here, it is assumed that
data at picture element position P is represented by D(x_0, y_0), data
at picture element position Q is represented by D(x_0+dx, y_0), the
position H to be embedded with interpolation data is represented by
H(x_h, y_h), and the interpolation data to be embedded is represented by
DH(x_h, y_h). In case interpolation has to be done between picture
element positions P(x_0, y_0) and R(x_0, y_0+dy) in the y direction, the
data is as given by Formula 13 (b). In this connection, the
interpolation data position in (b) is represented by I(x_i, y_i) and the
interpolation data is represented by DI(x_i, y_i).
Formula 13

(a)

$$DH(x_h, y_h) = (D(x_0, y_0) + D(x_0 + dx, y_0)) / 2$$
$$x_h = (x_0 + x_0 + dx) / 2$$
$$y_h = y_0$$

(b)

$$DI(x_i, y_i) = (D(x_0, y_0) + D(x_0, y_0 + dy)) / 2$$
$$x_i = x_0$$
$$y_i = (y_0 + y_0 + dy) / 2$$

Furthermore, it is considered that if interpolation is carried out in 
the y direction, too, using two interpolation picture elements neighboring 
each other in the y direction (obtained by interpolation in the x direction), 
the interpolation precision will further improve. 
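The midpoint interpolation of Formula 13 can be sketched as follows for the simplest case. This is a minimal sketch, assuming NumPy and an enlargement in which exactly one interpolation picture element is embedded between each pair of existing picture elements; the function name enlarge_edge_x2 is hypothetical. In line with the remark above, the y-direction step also interpolates between the picture elements already interpolated in the x direction.

```python
import numpy as np

def enlarge_edge_x2(edge):
    """Embed midpoints between neighboring picture elements (Formula 13 (a), (b))."""
    edge = np.asarray(edge, dtype=float)
    h, w = edge.shape
    out = np.zeros((2 * h - 1, 2 * w - 1))
    out[::2, ::2] = edge                                 # existing picture elements
    out[::2, 1::2] = (edge[:, :-1] + edge[:, 1:]) / 2    # Formula 13 (a), x direction
    out[1::2, :] = (out[:-2:2, :] + out[2::2, :]) / 2    # Formula 13 (b), y direction
    return out
```

For the 2 x 2 edge image [[0, 2], [4, 6]], this gives the 3 x 3 image [[0, 1, 2], [2, 3, 4], [4, 5, 6]].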

Edge Frequency (EdF) generating means 1602 in EEd estimating means 120B
performs orthogonal transform on the estimated image of the enlarged
edge (FIG. 17 (d)) obtained by EE approximating means 1601 to find its
frequency components (FIG. 17 (e)). That is done because the high
frequency components govern the clearness of the image and the features
of its details, and because plenty of high frequency components
representing the edge are contained in such an edge image. In other
words, it is based on the idea that because an extracted edge image
lacks information on the other portions, high frequency components
appear but low frequency components come out only at a low level. The
frequency components of the enlarged image are then estimated by
substituting the low frequency area of the frequency components of the
enlarged edge image (FIG. 17 (e)) obtained by EdF generating means 1602
with the frequency component data (FIG. 17 (b)), obtained by OI
orthogonal transforming means 11, which have the overall features of the
original image (FIG. 17 (a)). The enlarged image (FIG. 17 (g)), brought
back to real space data through inverse orthogonal transform means 1213,
is outputted so as to be handled by other image processing devices, that
is, to be displayed on a CRT, outputted on a printer, or the like.
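The substitution of the low frequency area can be sketched as follows. This is a minimal sketch under stated assumptions: it uses NumPy with an orthonormal DCT-II built from a cosine matrix, square images, and a scaling of the substituted coefficients by m/n. That scaling is an assumption of this sketch, chosen so that the mean density is preserved under the orthonormal transform; the text does not specify a scaling. All function names are hypothetical.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2(a):
    """Two-dimensional DCT of a 2-D array."""
    return dct_matrix(a.shape[0]) @ a @ dct_matrix(a.shape[1]).T

def idct2(f):
    """Two-dimensional inverse DCT (the matrix is orthogonal)."""
    return dct_matrix(f.shape[0]).T @ f @ dct_matrix(f.shape[1])

def substitute_low_frequency(original, enlarged_edge):
    """Replace the low n x n area of the enlarged-edge DCT with the original's DCT."""
    original = np.asarray(original, dtype=float)
    enlarged_edge = np.asarray(enlarged_edge, dtype=float)
    n, m = original.shape[0], enlarged_edge.shape[0]
    f_enlarged = dct2(enlarged_edge)
    f_enlarged[:n, :n] = dct2(original) * (m / n)   # assumed amplitude scaling
    return idct2(f_enlarged)
```

With an all-zero enlarged edge image, the result reduces to a smooth enlargement of the original by zero-padding in the frequency area; a non-zero edge image contributes the components outside the low n x n area.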

That way, the enlarged image can be estimated without difficulty 
without using an RBFN method as in Embodiment 11. Furthermore, the 
edge information is handed over in the original condition, and thus the 
edge can be enhanced without losing the high frequency components of the 
original image and the blurring of the image can be kept down. 

Embodiment 13 

FIG. 19 shows the arrangement of the image processing device of
Embodiment 13, and there will be described the operation of the image
processing device.

DCT is applied to an original image (FIG. 20 (a)) obtained by image
input means 10 so as to derive frequency component data (FIG. 20 (b)),
and at the same time an edge image (FIG. 20 (c)) of the original image
is generated by edge generating means 1600 using a Laplacian filter as
shown in FIG. 18. Edge Images (EI) orthogonal transforming means 1900 in
EEd estimating means 120B performs DCT on the edge image to acquire the
frequency component data of the edge image (FIG. 20 (d)).

From the frequency component data of this edge image, Edge Frequency
(EdF) estimating means 1901 in EEd estimating means 120B estimates the
frequency component data to be obtained from the edge portion of an
enlarged image (FIG. 20 (e)), using the radial basis function network
(RBFN) adopted in Embodiment 11, which is excellent in non-linear
approximation. The low frequency area of the frequency component data
thus estimated is then substituted with the frequency components
obtained from the original image (FIG. 20 (f)), thereby acquiring the
frequency component data of the enlarged image.

That way, as in Embodiment 12, the enlarged edge information containing
plenty of high frequency component data of the enlarged image is
estimated from the features of the high frequency components of an image
contained mainly in edge information, and by using that, the high
frequency components that give clearness to an enlarged image can be
compensated well. Furthermore, in Embodiment 12, the frequency component
data of an enlarged edge image is estimated from a simple
linear-interpolated image of the edge image of an original image. The
interpolation of the present embodiment is equivalent to non-linear
approximation, and it is considered, therefore, that the estimation
precision in the present embodiment will be higher than that of the
method of simply performing linear interpolation between two samples.

This non-linear estimation method differs from that in Embodiment 11
only in that the frequency component data of the edge image obtained
from the original image, and not the frequency component data of the
original image, is inputted to the RBFN of non-linear estimating means
1201 used in Embodiment 11, and will not be explained.

As this non-linear estimation method, it is possible to adopt a
neural network technique using learning data, a technique following a
fuzzy rule, a chaos deterministic prediction technique etc. as described
in Embodiment 11. It should be noted that for those techniques, it is
necessary to prepare large-scale learning data in advance.

Furthermore, EdF estimating means 1901 finds the intermediate values
between the DCT components (points n - 3, n - 2, n - 1 in FIG. 21 (a))
of an image, sampled from the high frequency side according to the
enlargement ratio s, as shown in FIG. 21. The frequency components of
the enlarged edge can then be estimated by allocating the DCT component
positions starting from the head side of the high frequency components
(points n-1+t-3, n-1+t-1 in FIG. 21 (b)). That way, the high frequency
components that give clearness to an enlarged image can be compensated.

It appears that, technically, there is not much difference between the
interpolation method shown in FIG. 21 and that in Embodiment 12. In
Embodiment 12, however, there is a possibility that the features of the
edge etc. will change depending on between which picture elements of the
edge image the interpolation picture element is preferentially embedded
when an enlarged edge image is prepared, and a proper interpolation
order has to be worked out. But in case interpolation is done with the
frequency components as shown in FIG. 21, the interpolation values of
the DCT components should be embedded starting from the head side of the
high frequency components, since the point is to compensate the high
frequency components. Then, the frequency components of the enlarged
image are estimated.

Embodiment 14

FIG. 23 shows the arrangement of the image processing device of 
Embodiment 14. 

Many of the original images inputted by image input means 10 have
hundreds of picture elements x hundreds of picture elements. If
orthogonal transform as in Embodiments 11, 12 and 13 is applied to such
an original image at a time, it will take a vast amount of time. To
avoid that, the data is usually divided into blocks, each of a size of 4
picture elements x 4 picture elements to 16 picture elements x 16
picture elements, the respective blocks are enlarged to a picture
element size according to a desired image enlargement ratio s, and they
are put together again. In the case of this method, however, the
following problem is pointed out.

FIG. 22 (a) schematically shows that problem. For simplicity, an example
will be explained where one-dimensional DCT is applied to the density
values D(x, y_0) of the x-direction picture elements with y = y_0 in
block Ai (shown in solid line).

There are n pieces of digital image data (density values), numbered 0 to
n-1, in the x direction. Performing one-dimensional DCT on them is
synonymous with approximating them with a linear combination of cosine
functions cos(i·x·π/n) (i = 0, ..., n-1), assuming that the function
expressing the n pieces of image data is a cyclic function with period n
in the x direction. Accordingly, D(x, y_0) can be expressed as in
Formula 14.
Formula 14

$$D(x, y_0) = \sum_{i=0}^{n-1} a_i \cdot \cos(i \cdot x \cdot \pi / n)$$

a_i represents the i-th DCT component. In such a case, as shown in solid
line in FIG. 22 (a), it is presupposed that in section [n, 2n - 1] the
same data as in section [0, n - 1] is repeated, and therefore a_i is
determined so that image data D(n, y_0) with x = n is identical with
data D(0, y_0) with x = 0. Therefore, the larger the difference between
data D(n - 1, y_0) with x = n - 1 and data D(n, y_0) = D(0, y_0) with
x = n, that is, x = 0, the wider the gap between the enlarged data D
embedded in the x direction and data D(n, y_0) with x = n of the
enlarged data obtained from the next block at x = n to x = 2n - 1. As a
result, D(x, y_0) will be discontinuous in the x direction near this
border.

Since two-dimensional DCT is obtained by expanding one-dimensional DCT
in the y direction, that discontinuity will also occur when the
one-dimensional case is expanded to two-dimensional DCT. The present
embodiment is a process addressing this problem.

An example will be considered where an original image is divided into
(n x n)-sized blocks Ai (i = 0, ..., L - 1) and each block is enlarged
to an (m x m)-sized block Bi (i = 0, ..., L - 1), thereby enlarging the
original image at an enlargement ratio s = m/n. L corresponds to the
number of blocks. Block dividing means 2300 in FIG. 23 divides the
original image not into blocks of a size of n x n but into blocks Ai'
(i = 0, ..., L - 1) of a size of (n + u) x (n + u) so that the
neighboring blocks partly overlap each other as shown on the left in
FIG. 24.

That is, as shown in FIG. 22 (b), DCT on block A0 is performed on
section [0, n + u] (in this case, it is presupposed that in section
[n + u + 1, 2n + 2u] the same data as in section [0, n + u] will be
repeated, as shown in dotted line in FIG. 22 (b)). Similarly, DCT on
block A1 is performed on section [n, 2n + u - 1]. That way, it is
possible to keep down the gap in density value occurring at the block
border when the conventional method is used (see frame R in FIG. 22).

In other words, noise N is caused in the end portion of block A0, and
the data in this portion of block A0 is not adopted but the data in A1
is used.

Then, EBI frequency means 2302 enlarges the frequency 
component data of block Ai to frequency component data of ((n + u) x s) x 
((n + u) x s). This enlargement is the same as that in Embodiments 1, 2 
and 3 except that the enlargement is carried out block by block in the 
present embodiment, and will not be explained. The block size n x n that 
is used is generally 4 picture elements x 4 picture elements to 16 picture 
elements x 16 picture elements. If n = 8, u that determines the 
overlapping portion is set at 3, but this is not restrictive. 

Block frequency extracting means 2303 does not use all the frequency
component data of block Bi' obtained but takes out data of the required
size m x m from the low frequency side, and processes it again to make
an enlarged block Ci. Block inverse orthogonal transform means 2304 then
effects inverse orthogonal transform on Ci, and Enlarged Images (EI)
recomposing means 2305 places the image data generated from block Ci at
the corresponding position, thus finally obtaining an enlarged image.

That way, it is possible to avoid discontinuity between the 
interpolation data of block Ci near the border of the neighboring blocks 
and the image data at the head of the next block Ci + 1. 
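The overlapping block division performed by block dividing means 2300 can be sketched as follows. This is a minimal sketch covering only the division step, assuming NumPy, an image whose height and width are multiples of n, and replication of the border picture elements for the blocks at the right and bottom ends (the text does not say how those blocks are filled). The name divide_overlapping is hypothetical.

```python
import numpy as np

def divide_overlapping(image, n, u):
    """Divide an image into (n + u) x (n + u) blocks Ai' placed every n picture elements."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    # Assumption: replicate the last row/column so that end blocks are full size.
    padded = np.pad(image, ((0, u), (0, u)), mode="edge")
    blocks = []
    for y in range(0, h, n):
        for x in range(0, w, n):
            blocks.append(padded[y:y + n + u, x:x + n + u])
    return blocks
```

With n = 8 and u = 3, as in the example given above, a 16 x 16 image yields four 11 x 11 blocks, and the last three columns of one block coincide with the first three columns of its right-hand neighbor.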

Embodiment 15 

FIG. 25 shows the arrangement of the image processing device of
Embodiment 15. In FIG. 25, Block (B) data transforming means 2500
performs a specific function transform, which will be described later,
on the image data within blocks Ai (i = 0, ..., L - 1) of an image size
of n x n into which the original image is divided by block dividing
means 2300, so as to keep down distortion occurring in the block image
after enlargement. Other than that, the present embodiment is identical
with Embodiment 14 in arrangement, except that the present embodiment
drops B frequency extracting means 2303 adopted in the image processing
device in Embodiment 14.

In the image processing device according to Embodiment 14, the original
image is divided by block dividing means 2300 into blocks of a size a
little larger than n x n so that the respective blocks overlap, and
frequency component data of a desired size is taken out of the enlarged
block obtained and substituted again with the frequency data of the
enlarged block whose original size is n x n. In the present embodiment,
on the other hand, block dividing means 2300 divides the original image
into blocks Ai (i = 0, ..., L - 1) of a size of n x n without having
them overlap, as shown in FIG. 26. The data within each block is then
transformed as will be described later, and enlarged blocks Bi
(i = 0, ..., L - 1) are estimated from the transformed image data. That
is the point where the present embodiment is different.

The operation of the image processing device of Embodiment 15 thus
constructed will now be described. An original image is divided into
blocks, and the image data in the respective blocks Ai
(i = 0, ..., L - 1) are transformed by B data transforming means 2500 as
follows.

FIG. 27 (a) shows the transition of density value D(x, y) in blocks Ai.
FIG. 27 (b), (c) and (d) show the transition of D(x, y_0) in the x
direction at y = y_0.

First, function p(x, y_0) that passes through image data D(0, y_0) at
x = 0 and image data D(n - 1, y_0) at x = n - 1 will be defined. There
are many possible functions, but here a first-order expression as in
Formula 15 will be defined (FIG. 27 (b)).

Formula 15

$$p(x, y_0) = D(0, y_0) + \frac{D(n-1, y_0) - D(0, y_0)}{n} \cdot x$$

This p(x, y_0) is subtracted from D(x, y_0). That is, Φ(x, y_0) =
D(x, y_0) - p(x, y_0) is calculated (FIG. 27 (c)). This calculation is
made for each y_0 within the block. In the y direction with x = x_0
fixed, furthermore, a similar subtraction function Φ(x_0, y) =
D(x_0, y) - p(x_0, y) is worked out. The image data is then normalized
with the maximum value Max, over (x = 0, ..., n - 1, y = 0, ..., n - 1),
of the absolute values of those subtraction functions (FIG. 27 (d)).

Because the data at this moment, seen in the x direction, is such that
the data at the block borders x = n - 1 and x = 0, that is, x = n, are
all 0 and all of the density values lie between -1 and 1, the difference
between the interpolation data to be embedded after x = n - 1 and the
data at x = n is very small. Thus, the distortion caused by block
connection can be kept down.

In this example, a first-order expression as in Formula 15 is used. But
generally any function that satisfies the border conditions at points G
and H in FIG. 27 poses no problem, and therefore if a function that can
keep down fluctuation after transform is used for p(x, y_0), distortion
can be further suppressed and the enlarged frequency components can be
estimated with precision.
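The x-direction part of this block data transform can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of B data transforming means 2500: it assumes NumPy and handles only the x direction. The text defines p(x, y_0) as passing through both D(0, y_0) and D(n - 1, y_0); to satisfy that exactly, the sketch divides the slope by n - 1, whereas Formula 15 as printed divides by n. That choice, and the name transform_rows, are assumptions of this sketch.

```python
import numpy as np

def transform_rows(block):
    """Subtract a straight line through both row ends, then normalize by Max."""
    block = np.asarray(block, dtype=float)
    n = block.shape[1]
    x = np.arange(n)
    first = block[:, :1]                         # D(0, y_0) for each row y_0
    last = block[:, -1:]                         # D(n - 1, y_0) for each row y_0
    ramp = first + (last - first) / (n - 1) * x  # p(x, y_0), assumed slope over n - 1
    phi = block - ramp                           # subtraction function
    max_abs = np.max(np.abs(phi))                # Max over the block
    if max_abs > 0:
        phi = phi / max_abs                      # normalized data within -1 to 1
    return phi, ramp, max_abs
```

The returned ramp and max_abs allow the original densities to be restored as phi x max_abs + ramp; how the enlarged block is restored after enlargement is not covered by this sketch.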

As set forth above, by bringing the density values at the block borders
close to each other for every block of the original image data, it is
possible to reduce the discontinuity at the block border that causes
fluctuation in estimating the enlarged image by interpolation on
orthogonal transform.

Embodiment 16 

FIG. 28 shows the arrangement of the image processing device 
according to Embodiment 16. 

In estimating an enlarged image of a color original image, the 
problem is that enlargement will be a vast amount of processing as 
compared with the enlargement of one-color multiple gradation data. The 
present embodiment is an invention to make the processing efficient. 

In a color original image inputted by image input means 10, SC selecting
means 2800 selects a color component to be made the standard. Generally,
a color original image is made up of three colors: red, green and blue.
Considering that green data contributes most to luminance information,
it is desirable to select the green component as the standard component.
TR deriving means 2801 finds simple ratio ratio_r of the red component
to the green component and simple ratio ratio_b of the blue component to
the green component. There are a variety of methods of finding the
simple ratio. Used here in this example are the mean value of the
density ratios of red to green within the object area and the mean value
of the density ratios of blue to green within the object area, as in
Formula 16.

Formula 16

$$ratio\_r = \frac{1}{n \times n} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \frac{r_{ij}}{g_{ij}}$$

$$ratio\_b = \frac{1}{n \times n} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \frac{b_{ij}}{g_{ij}}$$

In Formula 16, r_ij, g_ij, b_ij represent the densities of the red,
green and blue components respectively at picture element position
(i, j) of an original image. Instead of estimating all the remaining
components of the object area using one ratio coefficient like that, it
is also possible to adopt matrix R_r made up of the ratio of the red
component to the green component in each picture element and matrix R_b
made up of the ratio of the blue component to the green component in
each picture element. This way, it is possible to reproduce the features
of the color original image better and enlarge the color image with
higher precision than when using one ratio coefficient.

On this standard component, the green component, enlargement is
performed as in Embodiments 11 to 15 by Standard Images (SI) orthogonal
transforming means 2802, Standard Enlarged Image Frequency (SEIF)
estimating means 2803 and Standard Inverse (SI) orthogonal transforming
means 2804. ShC enlarging means 2805 then multiplies the enlarged green
data from SI orthogonal transforming means 2804 by the simple ratios
ratio_r and ratio_b, thus producing enlarged data of the red and blue
components. These three enlarged components are combined into one to
produce an enlarged image of the color original image, and EsEI output
means 1214 outputs the data to be handled by other image processing
devices, that is, to be displayed on a CRT, outputted on a printer, or
the like.

By performing orthogonal transform on one component alone, it is
possible to save the trouble of enlarging each of the plurality of
components making up a color original image and to simplify the
processing.
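The derivation of the simple ratios and the production of the red and blue enlarged data can be sketched as follows. This is a minimal sketch, assuming NumPy and 2-D arrays r, g, b of equal size for the object area; the lower bound eps that guards against a green density of 0 is an assumption of this sketch, and the function names are hypothetical. The enlargement of the green component itself is taken as given.

```python
import numpy as np

def simple_ratios(r, g, b, eps=1e-6):
    """Mean density ratios of red and blue to green within the object area."""
    g_safe = np.maximum(np.asarray(g, dtype=float), eps)  # assumed guard against g = 0
    ratio_r = float(np.mean(np.asarray(r, dtype=float) / g_safe))
    ratio_b = float(np.mean(np.asarray(b, dtype=float) / g_safe))
    return ratio_r, ratio_b

def enlarge_color(enlarged_g, ratio_r, ratio_b):
    """Produce enlarged red and blue data from the enlarged green data."""
    enlarged_g = np.asarray(enlarged_g, dtype=float)
    return np.stack([ratio_r * enlarged_g, enlarged_g, ratio_b * enlarged_g], axis=-1)
```

Using the per-picture-element matrices R_r and R_b instead of single ratios would additionally require enlarging those matrices to the enlarged image size, which this sketch does not attempt.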

Embodiment 17 

In Embodiments 17 to 21, image processing devices that enlarge an image
using Wavelet transform will be explained.

FIG. 29 shows the arrangement of the image processing device of
Embodiment 17. Image input means 10 is the same as those in
Embodiment 1, Embodiment 11 and others. II regulating means 12
interpolates or thins out (hereinafter both expressed as "regulate") the
horizontal and vertical picture elements of an original image having n
picture elements x n picture elements obtained by image input means 10
to 1/2 of a desired enlarged image size of Ln picture elements x Ln
picture elements. Image enlarging means 290A enlarges the image using
multiple resolution analysis in Wavelet transform which will be described
later. Enlarged Images (EI) regulating means 2913 regulates to the desired
image size of Ln picture elements x Ln picture elements the enlarged
image, which has four times as many picture elements as the original
image regulated by II regulating means 12. EsEI output means 1214 outputs
the image data after enlargement, obtained by EI regulating means 2913,
to other devices, for example for display.

Image enlarging means 290A is provided with Vertical Edge (VdE)
generating means 2900 which takes out a vertical-direction edge
component image from an original image regulated to Ln/2 picture
elements x Ln/2 picture elements by II regulating means 12, Horizontal
Edge (HEd) generating means 2901 which takes out a
horizontal-direction edge component image and Oblique Edge (OEd)
generating means 2902 which takes out an oblique-direction edge
component image. Furthermore, image enlarging means 290A has leveling up
means 2903 which generates an enlarged image of Ln picture elements x Ln
picture elements by inverse Wavelet transform, regarding the
above-mentioned three edge component images and the original image
regulated to Ln/2 picture elements x Ln/2 picture elements by II
regulating means 12 as the four sub-band components making up the
transformed image that would be obtained if the enlarged image of Ln
picture elements x Ln picture elements were subjected to Wavelet
transform.

In the thus formed image processing device of Embodiment 17, 
multiple resolution analysis by Wavelet transform is utilized for image 
enlargement. Wavelet transform, which is described in a number of 
publications including "Wavelet Beginners Guide," Susumu Sakakibara, 
Tokyo Electric Engineering College Publication Bureau, is developed and 
applied in many fields such as signal processing and compression of image 
data. 

A transformed image of an original image, which is obtained by
subjecting the original image to Wavelet transform, is formed of a number
of sub-band (partial frequency band) components. FIG. 31 shows a layout
example of sub-bands of a Wavelet transformed image in which an original
image is divided into 10 sub-bands, LL3, HL3, LH3, HH3, HL2, LH2, HH2,
HL1, LH1, HH1.

FIG. 30 is a diagram in which Wavelet transform as shown in FIG. 
31 is illustrated in the form of filter series. That is, the Wavelet 
transform is performed in three stages — stages I, II, III. In each stage, 
"Low" processing and "High" processing are performed in the vertical 
direction (y direction) and horizontal direction (x direction) separately. 
In the Low processing, low pass filtering and down-sampling (thinning 
out) to 1/2 are carried out. In the High processing, high pass filtering 
and down-sampling to 1/2 are conducted. 

First, High processing and Low processing are performed on the
original image in the horizontal direction. On the output of the
horizontal-direction High processing, High processing and Low processing
are performed in the vertical direction. The result of the High processing
in the vertical direction is the HH1 component, and the result of the Low
processing in the vertical direction is HL1. Then, on the output of the
horizontal-direction Low processing, Low processing and High processing
are performed in the vertical direction. The result of the
vertical-direction Low processing is LL1 and the result of the
vertical-direction High processing is LH1. Those are the results obtained
in the first Wavelet transforming.
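
The first stage described above can be sketched in Python as follows. The
filter coefficients of FIG. 30 are not given in this passage, so the sketch
assumes the simplest possible pair (2-tap averaging for Low, 2-tap
differencing for High, i.e. a Haar-type filter); the function names are
likewise hypothetical.

```python
def low(seq):
    """Low processing: 2-tap averaging filter, then down-sampling to 1/2."""
    return [(seq[2 * k] + seq[2 * k + 1]) / 2 for k in range(len(seq) // 2)]


def high(seq):
    """High processing: 2-tap differencing filter, then down-sampling to 1/2."""
    return [(seq[2 * k] - seq[2 * k + 1]) / 2 for k in range(len(seq) // 2)]


def horizontal(image, op):
    """Apply op along the horizontal (x) direction, i.e. to every row."""
    return [op(row) for row in image]


def vertical(image, op):
    """Apply op along the vertical (y) direction, i.e. to every column."""
    columns = [op(list(col)) for col in zip(*image)]
    return [list(row) for row in zip(*columns)]


def wavelet_stage(image):
    """One stage: returns (LL, HL, LH, HH), each half size in x and y."""
    h_high = horizontal(image, high)
    h_low = horizontal(image, low)
    hh = vertical(h_high, high)  # horizontal High, vertical High
    hl = vertical(h_high, low)   # horizontal High, vertical Low
    ll = vertical(h_low, low)    # horizontal Low, vertical Low
    lh = vertical(h_low, high)   # horizontal Low, vertical High
    return ll, hl, lh, hh


# 4 x 4 image whose density changes only in the horizontal direction.
image = [[0.0, 8.0, 8.0, 8.0] for _ in range(4)]
ll1, hl1, lh1, hh1 = wavelet_stage(image)
```

For this test image only HL1 is nonzero besides LL1, which is what one
would expect of the horizontal-High, vertical-Low output.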

Further, Low and High processing in the horizontal direction is
applied to the LL1 component. To the output of the horizontal-direction
High processing, High processing is applied in the vertical direction, and
the result is HH2. To the same output, Low processing is also applied in
the vertical direction, and the result is HL2. Further, to the output
obtained by the Low processing in the horizontal direction, Low processing
is applied in the vertical direction, and the result is LL2. To the same
output, High processing is also applied in the vertical direction, and the
result is LH2. Those are the results obtained in the second Wavelet
transforming.

Likewise, the LL2 is subjected to horizontal-direction Low
processing and High processing separately. In the vertical direction, too,
Low processing and High processing are performed separately. Thus
obtained are the sub-band components HH3, HL3, LH3, LL3. Those are
the results obtained in the third Wavelet transforming.
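
The three stages I, II, III can be chained as in the following sketch,
which collects the ten sub-bands of FIG. 31 in a dictionary. As an
assumption, the same Haar-type averaging and differencing pair stands in
for the filters of FIG. 30, and the names split, stage and decompose are
hypothetical.

```python
def split(seq):
    """Return the (Low, High) outputs of one filter pair, down-sampled to 1/2."""
    pairs = [(seq[2 * k], seq[2 * k + 1]) for k in range(len(seq) // 2)]
    return [(a + b) / 2 for a, b in pairs], [(a - b) / 2 for a, b in pairs]


def transpose(image):
    return [list(row) for row in zip(*image)]


def stage(image):
    """One Wavelet stage: horizontal filtering first, then vertical."""
    h_low = [split(row)[0] for row in image]
    h_high = [split(row)[1] for row in image]

    def vertical(img, index):
        # index 0 selects the Low output, index 1 the High output
        return transpose([split(col)[index] for col in transpose(img)])

    return {"LL": vertical(h_low, 0), "LH": vertical(h_low, 1),
            "HL": vertical(h_high, 0), "HH": vertical(h_high, 1)}


def decompose(image, stages=3):
    """Apply the stage repeatedly to the LL output, keeping HL, LH, HH."""
    bands = {}
    current = image
    for level in range(1, stages + 1):
        result = stage(current)
        for name in ("HL", "LH", "HH"):
            bands[f"{name}{level}"] = result[name]
        current = result["LL"]
    bands[f"LL{stages}"] = current
    return bands


# Hypothetical 8 x 8 test image.
image = [[float((x * y) % 7) for x in range(8)] for y in range(8)]
bands = decompose(image)
```

With an 8 x 8 input the stage-1 sub-bands are 4 x 4, the stage-2
sub-bands 2 x 2 and the stage-3 sub-bands 1 x 1.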

In the first stage of the Wavelet transform as shown, the original
image is broken down into four frequency components, LL1, HL1, LH1
and HH1, and down-sampled to 1/2 both in the horizontal and vertical
directions. Therefore, the size of the image representing the respective
components will be 1/4 of that of the original image. LL1 is a low
frequency component extracted from the original image and is a blurred
image of the original image. Most of the information of the original image
is contained in that component. Therefore, LL1 is the object for the second
Wavelet transform.

Meanwhile, the HL1 component obtained by the processing in FIG.
30 represents an image with the high frequency component extracted
intensively in the horizontal direction of the original image. The LH1
component represents an image with the high frequency component
extracted intensively in the vertical direction of the original image. And
HH1 represents an image with the high frequency component extracted
both in the horizontal and vertical directions. In other words, it can be
considered to be an image with the high frequency component extracted in
the oblique direction.

Viewed from the density value, it is considered that the HL1
component strongly reflects the area where the density value fluctuates
violently in the horizontal direction of the original image (edge
information in the vertical direction). On the other hand, the LH1 component
strongly reflects the area where the density value fluctuates violently in
the vertical direction of the original image (edge information in the
horizontal direction). The HH1 component strongly reflects the area
where the density value fluctuates violently in the horizontal and vertical
directions of the original image (edge information in the oblique direction).

Such characteristics produced by Wavelet transform also hold for
the components LL2, HL2, LH2, HH2 obtained in the second stage
of Wavelet transform of LL1. The same is applicable to the components of
Wavelet transform with LL2 as object image. Thus, the Wavelet transform
can be regarded as breaking down the LL image, in which the low frequency
component of the sub-band component image of one stage before is
extracted, into four 1/4 resolution images corresponding to the low
frequency component and the frequency components in vertical, horizontal
and oblique directions.

These sub-band component images can be synthesized by
filtering to restore the image of one stage before. This will be explained
with reference to FIG. 31. Synthesizing the four sub-band component images
LL3, HL3, LH3, HH3 restores LL2, and synthesizing LL2, HL2, LH2 and HH2
restores LL1. The original image can then be restored by using LL1, HL1,
LH1, HH1.
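
The synthesis of four sub-band component images into the image of one
stage before can be sketched as follows. The sketch assumes that the
analysis side used a Haar-type pair (Low = (a + b) / 2, High = (a - b) / 2),
so that each pair is recovered as a = Low + High and b = Low - High; the
function names and the sample sub-bands are hypothetical.

```python
def merge(low, high):
    """Undo one Low/High pair: up-sample by 2 and recombine."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def transpose(image):
    return [list(row) for row in zip(*image)]


def merge_vertical(low_img, high_img):
    """Recombine a vertical Low image and a vertical High image column by column."""
    columns = [merge(l, h)
               for l, h in zip(transpose(low_img), transpose(high_img))]
    return transpose(columns)


def synthesize(ll, hl, lh, hh):
    """Restore the image of one stage before from its four sub-bands."""
    h_low = merge_vertical(ll, lh)    # output of horizontal Low processing
    h_high = merge_vertical(hl, hh)   # output of horizontal High processing
    return [merge(l, h) for l, h in zip(h_low, h_high)]


# Hypothetical stage-1 sub-bands of a 4 x 4 image.
ll1 = [[4.0, 8.0], [4.0, 8.0]]
hl1 = [[-4.0, 0.0], [-4.0, 0.0]]
lh1 = [[0.0, 0.0], [0.0, 0.0]]
hh1 = [[0.0, 0.0], [0.0, 0.0]]
restored = synthesize(ll1, hl1, lh1, hh1)
```

Applying synthesize first to the stage-3 sub-bands, then to the result
together with the stage-2 sub-bands, and so on, restores the original
image stage by stage.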

Because such a Wavelet transform can express a plurality of
sub-band component images with different resolutions simultaneously, it
is also called the multiple resolution analysis. And the Wavelet
transform attracts attention as a technique that can compress data
efficiently by compressing the respective sub-band components.

In the present embodiment, first, the original image is regarded as
the low frequency sub-band component LL1 at one stage before. The next
step is to estimate the remaining images, that is, image HL1 with the high
frequency component strongly extracted in the horizontal direction, image
LH1 with the high frequency component strongly extracted in the vertical
direction, and image HH1 with the high frequency component strongly
extracted in the horizontal and vertical directions, and to obtain an
enlarged image four times as large. This processing is applied to enlarge
an original image to a desired size.

On the basis of the above description, the operation of the image 
processing device of the present embodiment will be explained. 

First, image input means 10 reads out an original image of a size n
picture elements x n picture elements to be enlarged.

The original image read by image input means 10 is regulated in
image size by II regulating means 12. As mentioned in the description of
the multiple resolution analysis in Wavelet transform, if an image, an
object for transform, is subjected to a Wavelet transform, the sub-band
components after transform will always become 1/2 of the original size
both in the number of picture elements in the horizontal direction and the
number of picture elements in the vertical direction. On the other hand,
the picture element size obtained by inverse Wavelet transform will be
twice as large as that of the original sub-band component both in the
number of picture elements in the horizontal direction and the number of
picture elements in the vertical direction. That is, the total number of
picture elements will be four times as many.

In view of such a nature, it is desirable that the number of picture
elements both in the horizontal and vertical directions of an enlarged
image obtained in an inverse transform is a multiple of 2. For this reason,
II regulating means 12 first regulates a desired enlarged image size Ln
picture elements x Ln picture elements, the enlargement ratio being L,
to a multiple of 2, that is, dLn picture elements x dLn picture elements,
and regulates the original image so that its size is dLn/2 picture elements
x dLn/2 picture elements. A number of regulating techniques are
available, but here in this embodiment it is to be realized by interpolating
between the picture elements using Formula 1 or by thinning out picture
elements in areas where the gradation changes little.
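
The size bookkeeping of II regulating means 12 might look like the sketch
below. Two points are assumptions rather than statements of the text: the
desired size Ln is rounded up to the next even number to obtain dLn (the
text only says "a multiple of 2"), and plain linear interpolation stands in
for Formula 1, which is not reproduced in this passage. All function names
are hypothetical.

```python
def regulated_sizes(n, L):
    """Return (Ln, dLn, dLn // 2) for an n x n original and enlargement ratio L."""
    Ln = round(n * L)
    dLn = Ln if Ln % 2 == 0 else Ln + 1  # assumption: round up to an even size
    return Ln, dLn, dLn // 2


def resize_row(row, new_len):
    """Linearly interpolate (or thin out) one row of at least 2 samples."""
    if new_len == 1:
        return [row[0]]
    scale = (len(row) - 1) / (new_len - 1)
    out = []
    for k in range(new_len):
        pos = k * scale
        i = min(int(pos), len(row) - 2)
        frac = pos - i
        out.append(row[i] * (1 - frac) + row[i + 1] * frac)
    return out


def regulate(image, size):
    """Regulate a square image to size x size, rows first and then columns."""
    rows = [resize_row(r, size) for r in image]
    columns = [resize_row(list(c), size) for c in zip(*rows)]
    return [list(r) for r in zip(*columns)]


# Hypothetical case: 4 x 4 original, enlargement ratio L = 2.5.
ln, dln, half = regulated_sizes(4, 2.5)
original = [[float(4 * y + x) for x in range(4)] for y in range(4)]
regulated = regulate(original, half)
```

Here the original is regulated to 5 x 5 so that one inverse Wavelet
transform yields 10 x 10. For an odd desired size such as 7, the sketch
would work with dLn = 8 and leave the remaining one picture element to a
later adjustment.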

Here, the following methods may also be applied. That is,
the original image is transformed into frequency space by orthogonal
transform such as DCT, and the frequency components
corresponding to dLn/2 picture elements x dLn/2 picture elements are
taken out. Or the shortage high frequency components are embedded
with "0" and the image is regulated by inverse orthogonal transform
corresponding to dLn/2 picture elements x dLn/2 picture elements. But
considering the processing efficiency and the enlargement by Wavelet
transform technique that follows, such complicated processing is not
considered efficient. For this reason, simple interpolation between picture
elements or thinning out is adopted.

Image enlarging means 290A enlarges the original image,
regulated to dLn/2 picture elements x dLn/2 picture elements by II
regulating means 12, twice both in the horizontal and vertical directions
to an image size close to the desired Ln picture elements x Ln picture
elements. For this, the multiple resolution analysis in the Wavelet
transform is utilized. The problems with the prior art method in which
orthogonal transform is performed and the shortage component is
compensated in the frequency area are the processing time and the
occurrence of jaggy noise at block joints because an image is divided into
a plurality of blocks. But the Wavelet transform offers an advantage that,
because a large image can be handled at a time, no such noise will be
caused.

Furthermore, in case sub-band component LL1 in FIG. 31 is taken
as the image regulated to dLn/2 picture elements x dLn/2 picture elements,
image enlarging means 290A will have to estimate the images of dLn/2
picture elements x dLn/2 picture elements corresponding to the remaining
three sub-bands HL1, LH1, HH1.

FIG. 32 schematically shows that procedure. Here, a method is
adopted in which the three images of sub-band components HL1, LH1,
HH1 are taken as edge images of LL1 in three directions. As mentioned
above, the component HL1 is to represent an image with the high
frequency component strongly extracted in the horizontal direction of an
enlarged image (to be named LL0) having four times as many picture
elements as the original image regulated to dLn/2 picture elements x
dLn/2 picture elements (FIG. 31 (a)), and the sub-band component LH1 is
to represent an image with the high frequency component strongly
extracted in the vertical direction of sub-band component LL0. And the
sub-band component HH1 will be an image with the high frequency
component extracted both in the horizontal and vertical directions. That
is, the sub-band component HL1 is considered to reflect the area
representing a high frequency component in the horizontal direction, that
is, edge information in the vertical direction of the image of sub-band
component LL0 (FIG. 32 (b)). On the other hand, the sub-band
component LH1 is considered to reflect the area representing a high
frequency component in the vertical direction, that is, edge information in
the horizontal direction of the image of sub-band component LL0 (FIG. 32
(c)). And the sub-band component HH1 is considered to reflect the area
representing a high frequency component both in the vertical and
horizontal directions, that is, edge information in the oblique direction of
the image of sub-band component LL0 (FIG. 32 (d)).

In the present embodiment, therefore, there is provided edge
generating means 290B of an arrangement as shown in FIG. 29, where VdE
generating means 2900 extracts the edge component in the vertical
direction of the original image regulated to dLn/2 picture elements x dLn/2
picture elements by II regulating means 12 and takes it as the shortage HL1
component. HEd generating means 2901 extracts the edge
component in the horizontal direction of the original image regulated to
dLn/2 picture elements x dLn/2 picture elements by II regulating means 12
and takes it as the shortage LH1 component. Similarly, OEd generating
means 2902 extracts the edge component in the oblique direction of the
original image regulated to dLn/2 picture elements x dLn/2 picture
elements by II regulating means 12 and takes it as the shortage HH1
component.

In this process, it is to be assumed that the edge generating means
2900, 2901 and 2902 use edge detection filters to perform detection in
three directions as shown in FIG. 33. The example shown in FIG. 33 (a) is
to detect the edge in the horizontal direction using a filter in which
weighting increases in the horizontal direction. The example shown in
FIG. 33 (b) is to detect the edge in the vertical direction using a filter in
which weighting increases in the vertical direction. The example shown in
FIG. 33 (c) is to detect the edge in the oblique direction using a filter in
which weighting increases in the oblique direction. These are not the only
applicable filters; other filters may be used.
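
Directional edge detection of this kind can be sketched with 3 x 3 filters
as below. The coefficients of FIG. 33 are not reproduced in this passage,
so the Sobel-type kernels used here are substitutes chosen for
illustration, and the border handling (replicating the outermost picture
elements) is likewise an assumption.

```python
def filter3x3(image, kernel):
    """Apply a 3 x 3 filter, replicating picture elements at the border."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + 1][dx + 1] * image[yy][xx]
            out[y][x] = acc
    return out


# Substitute kernels (Sobel type), not the coefficients of FIG. 33.
CHANGE_ALONG_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # edges extending vertically
CHANGE_ALONG_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # edges extending horizontally
CHANGE_OBLIQUE = [[0, 1, 2], [-1, 0, 1], [-2, -1, 0]]    # oblique edges

# 4 x 4 image with a density step in the horizontal direction.
image = [[0.0, 0.0, 10.0, 10.0] for _ in range(4)]
edge_x = filter3x3(image, CHANGE_ALONG_X)
edge_y = filter3x3(image, CHANGE_ALONG_Y)
edge_oblique = filter3x3(image, CHANGE_OBLIQUE)
```

For this image the filter that weights changes along x responds on both
sides of the step, while the filter that weights changes along y gives no
response.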

With the estimated four sub-band components subjected to inverse 
Wavelet transform as described, leveling up means 2903 acquires a clear 
enlarged image of a size dLn picture elements x dLn picture elements. In 
this case, since the Wavelet transform can be illustrated in filter series 
processing as shown in FIG. 30, the processing here may be filter series 
processing that does the reverse of that. 
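
Putting the pieces of the present embodiment together, one enlargement step
might be sketched as follows: the regulated image is taken as LL1, three
edge images are taken as HL1, LH1, HH1, and an inverse transform doubles
the size in each direction. The inverse Haar-type filter, the
neighbour-difference edge images and the gain of 0.25 are all assumptions
made to keep the sketch short; they are not the filters of FIG. 30 or
FIG. 33.

```python
def merge(low, high):
    """Undo one Low/High pair of a Haar-type filter (a = l + h, b = l - h)."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


def transpose(image):
    return [list(row) for row in zip(*image)]


def level_up(ll, hl, lh, hh):
    """Inverse one-stage transform: four m x m sub-bands to one 2m x 2m image."""
    def merge_vertical(lo, hi):
        return transpose([merge(a, b)
                          for a, b in zip(transpose(lo), transpose(hi))])
    h_low = merge_vertical(ll, lh)
    h_high = merge_vertical(hl, hh)
    return [merge(a, b) for a, b in zip(h_low, h_high)]


def edge_images(image, gain=0.25):
    """Hypothetical stand-ins for means 2900 to 2902: scaled neighbour differences."""
    h, w = len(image), len(image[0])

    def px(y, x):  # border picture elements are replicated
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    hl = [[gain * (px(y, x - 1) - px(y, x + 1)) for x in range(w)] for y in range(h)]
    lh = [[gain * (px(y - 1, x) - px(y + 1, x)) for x in range(w)] for y in range(h)]
    hh = [[gain * (px(y - 1, x - 1) - px(y + 1, x + 1)) for x in range(w)]
          for y in range(h)]
    return hl, lh, hh


def enlarge_twice(image):
    """Treat image as LL1, estimate HL1, LH1, HH1 from edges, and level up."""
    hl, lh, hh = edge_images(image)
    return level_up(image, hl, lh, hh)


flat = [[5.0] * 3 for _ in range(3)]
ramp = [[float(x) for x in range(4)] for _ in range(4)]
enlarged_flat = enlarge_twice(flat)
enlarged_ramp = enlarge_twice(ramp)
```

A flat area has no edges, so all three estimated sub-bands are zero and
the enlarged image stays flat; in areas with edges the estimated
sub-bands add detail that pure interpolation would not supply.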

If the desired image size Ln picture elements x Ln picture elements
is not a multiple of 2 (in case Ln/2 is not an integer), a slight
difference arises between that size and the size of the enlarged image
obtained by image enlarging means 290A. In such a case, EI regulating
means 2913 performs interpolation between picture elements or thinning
out to make up for the slight difference. At most one picture element is
processed here, and the processing is done in an area where the image
changes little (an area, other than an edge, where the gradation change is
small), so its effect is small.

An enlarged image obtained by EI regulating means 2913 is 
handed over to other devices by Enlarged Images (EI) output means 2914 
and displayed on CRT or used in some other way. 

According to the present embodiment, as set forth above, the
blurring of an image can be kept down, unlike the prior art method of
merely interpolating between the picture elements in the original image
and the prior art device in which the shortage of frequency components in
the frequency area is embedded with "0." The present embodiment also
produces a clear enlarged image without causing noise such as jaggies,
which is considered a problem encountered with the orthogonal transform
method. According to the present embodiment, furthermore, it is possible
to estimate an enlarged image in a handy manner without preparing a rule
in advance.

Embodiment 18 

FIG. 34 is a block diagram showing the arrangement of image
enlarging means 290A making up the image processing device of
Embodiment 18 of the present invention. The operation of this
device will be explained.

As in Embodiment 17, II regulating means 12 regulates the desired
enlarged image size Ln picture elements x Ln picture elements to dLn
picture elements x dLn picture elements, a multiple of 2 both in the
horizontal and vertical directions. The original image obtained by image
input means 10 is then regulated to 1/4 of that size, that is, dLn/2
picture elements x dLn/2 picture elements.

Receiving the results of the processing by II regulating means 12,
input fine-adjustment means 700 fine-adjusts dLn/2 picture elements x
dLn/2 picture elements by one picture element to a multiple of 2, that is,
ddLn picture elements x ddLn picture elements, so that the sub-band
components one level below can be acquired at leveling down means 701
from the original image regulated to dLn/2 picture elements x dLn/2
picture elements. Leveling down means 701 performs Wavelet transform on
the original image of ddLn picture elements x ddLn picture elements
acquired at input fine-adjustment means 700 and generates four sub-band
components LL2, HL2, LH2, HH2 with the image size being 1/4 of that of
the original image.

FIG. 35 schematically shows the outline of the processing at image 
enlarging means 290A. 

In Embodiment 17, the edge image in the vertical direction, the
edge image in the horizontal direction and the edge image in the oblique
direction, which were obtained from a current object image LL1, are taken
as the sub-band components HL1, LH1, HH1 that are short in the Wavelet
transformed image of the image LL0 obtained by enlarging the current
object image by four times. But strictly speaking, this does not agree with
the filtering in FIG. 30.

For example, in case sub-band component HL1 is acquired from
sub-band component LL0 by filtering, high frequency component data in
the horizontal direction and low frequency component data in the vertical
direction are extracted as HL1 component by filtering in FIG. 30.
Because of that, in the HL1 component, there are extracted a picture
element portion where the value fluctuates violently in the horizontal
direction (edge etc. extending in the vertical direction) and a picture
element portion where the value fluctuation is small in the vertical
direction. In Embodiment 17, it is thought that of those portions, the
picture element portion where the value change is great in the horizontal
direction, that is, the edge portion extending in the vertical direction,
has a great effect, and edge information in the vertical direction only is
taken as HL1 component. In some images handled, there are cases where the
effect of the image portion in which the value change is small in the
vertical direction can not be ignored. Furthermore, edge information in
the vertical direction contains plenty of the picture element portion where
the value change is great in the horizontal direction but, strictly speaking,
that is not always true. This is also applicable to other sub-band
components LH1, HH1.

In consideration of that, in the present embodiment, it is decided
to prepare sub-band components LL2, HL2, LH2, HH2 by
Wavelet-transforming the current object image and thus lowering the
components by one level. The correction amounts dHL, dLH, dHH for
estimating the sub-band components corresponding to the three edge
images of the original object image are found from the correlation between
the edge images HLe, LHe, HHe in three directions of the low frequency
component LL2 among those sub-band components and the
actual three sub-band components HL2, LH2, HH2.

First, there is provided an arrangement formed of Reference
Components (RC) generating means 70A, correction estimating means 70B,
and component estimating means 70C as shown in FIG. 34. Means
702 for generating reference HL component detects edge information in the
vertical direction using a filter as shown in FIG. 33 (b), with attention paid
to the LL2 component, which is present in the low frequency area and
expresses the features of the original image more suitably than the
sub-band components HL2, LH2, HH2. That edge information will be
named reference HL component HLe. HL correction estimating means
705 checks the correlation between the reference HL component HLe and
HL2 obtained by leveling down means 701.

There are a number of methods for finding that correlation. Here,
difference image dHL between reference HL component HLe and actual HL
component HL2, that is, dHL = HLe - HL2, will be found as shown in FIG.
35 (c). Means 703 for generating reference LH component and means 704
for generating reference HH component, too, select edge information in the
horizontal direction of the LL2 component as reference LH component LHe
and edge information in the oblique direction of the LL2 component as
reference HH component HHe. LH correction estimating means 706 finds the
difference image dLH = LHe - LH2 between reference LH component LHe
and actual LH component LH2. HH correction estimating means 707
finds the difference image dHH = HHe - HH2 between the reference HH
component HHe and actual HH component HH2.

As in Embodiment 17, HL component estimating means 708,
LH component estimating means 709 and HH component estimating means
710 first enlarge the images of correction components dHL, dLH, and
dHH to components with ddLn picture elements x ddLn picture
elements. Next, means 708, 709 and 710 subtract the correction
components dHL, dLH, and dHH from the above-mentioned edge components
HL1, LH1, HH1 obtained by VdE generating means 2900, HEd generating
means 2901 and OEd generating means 2902 respectively, and estimate the
HL1, LH1, HH1 components at the time when the original image regulated
to ddLn picture elements x ddLn picture elements by input
fine-adjustment means 700 is taken as sub-band component LL1. HL
component estimating means 708, LH component estimating means 709
and HH component estimating means 710 do fine-adjustment by
interpolating between the respective picture elements in accordance with
Formula 1 so that the picture element size of each corrected image will be
ddLn picture elements x ddLn picture elements when the above-mentioned
correction components dHL, dLH, dHH are used. This is not the only
way; other methods may be applied, including a conventional
method of enlarging the image size twice both in the horizontal direction
and the vertical direction by embedding the shortage component with 0 in
the area where the frequency is transformed.

But it takes long to embed 0 in shortage areas and to perform
enlargement by inverse Wavelet transform using the sub-band component.
In consideration of processing efficiency, simple interpolation is better and
is considered to have little effect on the picture quality of the finished
enlarged image. For this reason, HL component estimating means 708, LH
component estimating means 709 and HH component estimating means
710 are to adopt a linear interpolation method as in Formula 1.
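
The correction of the present embodiment can be sketched for one sub-band
as follows: the difference image between the reference component and the
actual component one level below is found, enlarged by linear
interpolation, and subtracted from the edge component of the current
level. Formula 1 is not reproduced in this passage, so ordinary linear
interpolation stands in for it; the function names and sample values are
hypothetical.

```python
def difference(reference, actual):
    """Element-wise difference image, e.g. dHL = HLe - HL2."""
    return [[r - a for r, a in zip(ref_row, act_row)]
            for ref_row, act_row in zip(reference, actual)]


def stretch(row, new_len):
    """Linear interpolation of one row to new_len (> 1) samples."""
    if len(row) == 1:
        return [row[0]] * new_len
    scale = (len(row) - 1) / (new_len - 1)
    out = []
    for k in range(new_len):
        pos = k * scale
        i = min(int(pos), len(row) - 2)
        frac = pos - i
        out.append(row[i] * (1 - frac) + row[i + 1] * frac)
    return out


def enlarge(image, size):
    """Enlarge a square image to size x size, rows first and then columns."""
    rows = [stretch(r, size) for r in image]
    columns = [stretch(list(c), size) for c in zip(*rows)]
    return [list(r) for r in zip(*columns)]


def estimate_band(edge_of_ll1, reference_of_ll2, actual_band2):
    """Estimate e.g. HL1 as (edge image of LL1) - enlarged (HLe - HL2)."""
    correction = difference(reference_of_ll2, actual_band2)
    correction = enlarge(correction, len(edge_of_ll1))
    return difference(edge_of_ll1, correction)


# Hypothetical values: 2 x 2 components one level below, 4 x 4 edge image.
hle = [[3.0, 3.0], [3.0, 3.0]]
hl2 = [[1.0, 1.0], [1.0, 1.0]]
edge_hl1 = [[5.0] * 4 for _ in range(4)]
d_hl = difference(hle, hl2)
hl1_estimate = estimate_band(edge_hl1, hle, hl2)
```

The same function would be called with (LHe, LH2) and (HHe, HH2) to
estimate LH1 and HH1.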

The processing steps by leveling up means 2903 and after that are the
same as in Embodiment 17.

In the above description, estimation by HL component estimating
means 708, LH component estimating means 709 and HH component
estimating means 710 is made by subtracting, from the edge components,
the difference components between reference components and actual
components. In addition to that, the following methods are also suitable.

(1) The correction amounts obtained at HL correction estimating
means 705, LH correction estimating means 706 and HH correction
estimating means 707 are multiplied by a transform coefficient matrix.
The products are added to the component values of VdE generating means
2900, HEd generating means 2901, and OEd generating means 2902
respectively.

(2) The correction amounts obtained by HL correction estimating
means 705, LH correction estimating means 706 and HH correction
estimating means 707 are transformed by the transform function. The
results are added to the results of VdE generating means 2900, HEd
generating means 2901 and OEd generating means 2902 respectively.

(3) A neural network model is used which has been trained to take as
input the results of HL correction estimating
means 705, LH correction estimating means 706, HH correction estimating
means 707, and VdE generating means 2900, HEd generating means 2901,
and OEd generating means 2902 and to output the estimated values of HL,
LH, HH.

(4) The HL, LH, HH components are estimated from a large
data base or rule base prepared in advance by inputting the results of
HL correction estimating means 705, LH correction estimating means 706,
HH correction estimating means 707, and VdE generating means 2900,
HEd generating means 2901, and OEd generating means 2902.

According to the present embodiment, shortage sub-band
components, especially high frequency components in Wavelet transform
images, which can not be taken out merely by edge detection in three
directions of a regulated original image as in Embodiment 17, can be
estimated with high precision and thus the blurring of an image can be
kept down. Furthermore, Wavelet transform does not require block
division as in orthogonal transform and there arises no block distortion,
which is the problem encountered with the prior art method using
orthogonal transform.

Embodiment 19 

FIG. 36 shows the arrangement of image enlarging means 290A of
the image processing device of Embodiment 19, and the operation of this
image processing device will be described.

The processing steps by image input means 10 and II regulating
means 12 are the same as those in Embodiment 17, and the processing by
input fine-adjustment means 700 is also identical with that in
Embodiment 18.

Reference Components (RC) generating means 3601 finds reference
components from LL2, which is in the low frequency area of the sub-band
component image obtained by leveling down means 701. These reference
components are used to determine the respective correction amounts for
the estimation of the HL, LH, HH components made by HL component
estimating means 708, LH component estimating means 709 and HH
component estimating means 710. Here, a Laplacian filter as shown in
FIG. 18 (a)(b) is used, and the typical edge image of LL2 will be taken as
the reference component image. The Laplacian filter is often used for
detection of edges where not so much restriction is imposed on the
direction, rather than edges in a specific direction as explained in
FIG. 33.

With this edge as reference component image, the present 
embodiment finds the correction amount as in Embodiment 18. This way, 
the repeating of the edge detection procedure as pointed out in 
Embodiment 18 can be reduced, and thus the processing is made efficient. 

As in Embodiment 18, HL correction estimating means 705, LH 
correction estimating means 706 and HH correction estimating means 707 
find the respective difference images dHL2, dLH2, dHH2 between the edge 
image obtained by RC generating means 3601 and HL2, LH2, HH2 
obtained by leveling down means 701 (see FIG. 35 (c)), and each difference 
image is regulated to an image having ddLn picture elements x ddLn 
picture elements by linear approximation as in Formula 1. 

Meanwhile, edge generating means 3600 detects an edge image
using a Laplacian filter from an original image of ddLn picture elements x
ddLn picture elements regulated by II regulating means 12. And HL
component estimating means 708, LH component estimating means 709
and HH component estimating means 710 add correction images obtained
by HL correction estimating means 705, LH correction estimating means
706 and HH correction estimating means 707 to the edge image, whereby
the respective sub-band components HL1, LH1, HH1 can be estimated with
high precision.

That way, it is possible to clearly estimate an enlarged image of (2 
x ddLn) picture elements x (2 x ddLn) picture elements. If interpolation 
and thinning out of a number of picture elements are added to that, an 
enlarged image having a desired size Ln picture elements x Ln picture 
elements can be obtained with high precision. 

In addition to the estimation by HL component estimating means 708,
LH component estimating means 709 and HH component estimating means
710 as explained in Embodiment 18, it is possible to use the product of the
correction amounts obtained by HL correction estimating means 705, LH
correction estimating means 706 and HH correction estimating means 707
multiplied by a certain transform coefficient matrix, or the results of
transform by the transform function.

Embodiment 20 

FIG. 37 shows the arrangement of the image processing device of 
Embodiment 20. 

The outline of the present embodiment is as follows. The number of 
picture elements after enlargement is not known in advance. An original 
image is therefore enlarged by a factor of two in both the horizontal and 
vertical directions in accordance with multiple resolution analysis by 
Wavelet transform, and the enlarged image is shown to the user. This 
process is repeated until the user obtains an image of the desired size. 

An original image of a size n picture elements x n picture elements 
inputted by image input means 10 is set as an enlargement object image 
by EP initializing means 3700. Then, Obi enlarging means 3701 enlarges 
the original image of a size n picture elements x n picture elements by a 
factor of two in both the horizontal direction and the vertical direction, 
that is, to four times as many picture elements. In this enlargement step, 
the enlargement object image can always be enlarged to four times as 
many picture elements by using the image enlargement means of the image 
processing devices described in Embodiments 17, 18 and 19. 

Enlarged image presenting means 1302 shows the user the current 
enlarged image obtained by Obi enlarging means 3701 on a CRT etc. If the 
resolution of the image exceeds that of the CRT etc., providing a function 
of moving the visual point with a cursor etc. or a function of cutting out 
a specific part of the image would help the user to judge whether the 
displayed enlarged image is the one needed. 

Receiving the instruction from the user, MP ending judge means 
3703 refers the process to image fine-adjustment means 3704 if the image 
is of a desired size, and, if an indication is received that the size of the 
enlarged image is not a desired one, sets this enlarged image as the next 
enlargement object image and returns the process to Obi enlarging means 
3701. 
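
The loop formed by means 3700, 3701, 1302 and 3703 can be sketched as 
below. This is a minimal illustration, not the device itself: picture 
element replication stands in for the Wavelet-based enlargement of 
Embodiments 17 to 19, the user's judgement is modelled by a hypothetical 
callback user_accepts, and max_rounds is an assumed safety limit that 
does not appear in the embodiment. 

```python
def enlarge_2x(img):
    """Stand-in enlarger: replicate every picture element into a 2 x 2 block."""
    out = []
    for row in img:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


def enlarge_until_accepted(original, user_accepts, enlarge=enlarge_2x,
                           max_rounds=8):
    """Repeat 2x enlargement until user_accepts(image) returns True."""
    target = original                   # role of EP initializing means 3700
    for rounds in range(1, max_rounds + 1):
        enlarged = enlarge(target)      # role of Obi enlarging means 3701
        if user_accepts(enlarged):      # presentation (1302) and judgement (3703)
            return enlarged, rounds
        target = enlarged               # becomes the next enlargement object image
    return target, max_rounds           # limit reached without acceptance
```

For example, starting from a 2 x 2 image and accepting the first image 
with at least 16 rows, the loop runs three times and returns a 16 x 16 
image. 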

Image fine-adjustment means 3704 asks the user if fine-adjustment 
is needed. Since multiple resolution analysis by Wavelet transform is 
used in image enlargement, the enlarged image always has four times as 
many picture elements as the image before the enlargement. The user may 
therefore find that the previous image is too small while the enlarged 
image is too large to display on the CRT at one time. Image 
fine-adjustment means 3704 asks the user if the image size should be 
adjusted to some extent. If the user wants the image to be enlarged a 
little, picture element interpolation is performed. If the user wishes 
to have the image slightly reduced in size, thinning out of picture 
elements is performed. This way, the image size is re-adjusted. 

For performing the picture element interpolation, an area - other 
than the edge - where the density change is small is selected. The same 
is the case with the thinning out. 
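
One way to picture this selection is sketched below. It is a 
simplification under stated assumptions: whole columns are inserted or 
removed rather than individual areas, the "density change" is measured as 
the summed absolute difference between adjacent columns, and the image 
is at least two picture elements wide. The function names are 
hypothetical. 

```python
def flattest_column(img):
    """Index x where the change between columns x and x + 1 is smallest."""
    width = len(img[0])

    def change(x):
        return sum(abs(row[x + 1] - row[x]) for row in img)

    return min(range(width - 1), key=change)


def widen_by_one(img):
    """Interpolate one new column inside the flattest column pair."""
    x = flattest_column(img)
    return [row[:x + 1] + [(row[x] + row[x + 1]) / 2] + row[x + 1:]
            for row in img]


def narrow_by_one(img):
    """Thin out one column taken from the flattest column pair."""
    x = flattest_column(img)
    return [row[:x] + row[x + 1:] for row in img]
```

Because the column is added or removed where neighbouring densities are 
nearly equal, an edge elsewhere in the row keeps its original step. 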

Instead of the interpolation and thinning out, there is a technique 
in which the original image is transformed into the frequency domain and 
missing components are added to, or excess components are removed from, 
the high frequency side. A proper technique is selected with the 
processing time, CPU capacity etc. taken into consideration. 
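
The frequency-domain alternative can be illustrated for a single row of 
picture elements. The sketch assumes an orthonormal discrete cosine 
transform as the frequency transform and fills the missing high-frequency 
components with zeros, which is only a placeholder for whatever 
components would actually be added. The names dct, idct and resize_row 
are hypothetical. 

```python
import math


def _scale(k, n):
    """Orthonormal DCT scale factor for coefficient k of an n-point transform."""
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    return [_scale(k, n)
            * sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, v in enumerate(x))
            for k in range(n)]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(c)
    return [sum(_scale(k, n) * v * math.cos(math.pi * (i + 0.5) * k / n)
                for k, v in enumerate(c))
            for i in range(n)]


def resize_row(x, m):
    """Resize one row to m samples by padding or truncating its DCT."""
    n = len(x)
    coeffs = (dct(x) + [0.0] * m)[:m]   # add zeros or drop the highest frequencies
    gain = math.sqrt(m / n)             # keeps the mean density unchanged
    return idct([gain * c for c in coeffs])
```

Applying resize_row to every row and then to every column would adjust 
both dimensions; the cost grows with the square of the row length in this 
direct form, which is one reason the choice of technique depends on the 
processing time available. 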

EsEI output means 1214 outputs the enlarged image obtained by 
image fine-adjustment means 3704, that is, displays the image on a CRT 
etc., prints it out or passes it to other devices. 

According to the present embodiment, as set forth above, the detailed 
enlarged image obtained is shown to the user, who judges if the size and 
resolution are just right. Once the user finds the image to be the 
desired one, the series of enlargement steps can be stopped. There is 
therefore no need to set the enlargement ratio in advance, and it is 
possible to enlarge an image simply to the size desired by the user. 

Embodiment 21 

Finally, the image processing device of Embodiment 21 will be 
explained. The present embodiment relates to efficient estimation of an 
enlarged image of a color original image. 

FIG. 38 is a block diagram showing the arrangement of the image 
processing device of this embodiment. The operation of this image 
processing device will now be described. 

First, from a color original image inputted by image input means 
10, SC selecting means 2800 selects the green component as the standard 
color component. TR depriving means 2801 then finds the simple ratio 
ratio_r of the red component and the simple ratio ratio_b of the blue 
component to the green component. This process is the same as that in 
Embodiment 16 and will not be explained. 

Then, as in the image processing devices of Embodiments 17, 18 and 
19 of the present invention, Standard Component Image (SCI) regulating 
means 3802 and Standard Image (SI) enlarging means 3803 enlarge the 
standard color component, that is, the green component. The enlarged 
image of the standard color thus obtained is subjected to picture element 
interpolation or thinning out by Standard Enlarged Image (EI) regulating 
means 3804 so as to obtain a desired image size of Ln picture elements x 
Ln picture elements. Furthermore, Shortage Components (ShC) enlarging 
means 3805 multiplies the enlarged green component by the simple ratios 
ratio_r and ratio_b, thereby preparing data on the remaining red and blue 
components. 

Enlarged Color Image (ECI) recomposing means 3806 combines 
these three enlarged components into one, thus producing an enlarged 
image of the color original image. EsEI output means 1214 outputs the 
enlarged image obtained by image fine-adjustment means 3704, that is, 
displays the image on a CRT etc., prints it out or passes it to other 
devices. 

Through that process, it is possible to simplify and speed up the 
processing, since there is no need to enlarge each of the plurality of 
color components making up a color original image. 
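
The saving can be seen in the following sketch, in which only the green 
plane passes through the enlarger. It rests on assumptions that the 
passage does not confirm: the simple ratio is taken here as the total 
density of a component divided by the total density of green (the 
definition in Embodiment 16 may differ), and picture element replication 
stands in for the enlargement of Embodiments 17 to 19. All function names 
are hypothetical. 

```python
def simple_ratio(channel, green):
    """Assumed definition: total channel density over total green density."""
    total_green = sum(map(sum, green))
    return sum(map(sum, channel)) / total_green if total_green else 0.0


def replicate_2x(img):
    """Stand-in enlarger: replicate every picture element into a 2 x 2 block."""
    return [[v for v in row for _ in (0, 1)] for row in img for _ in (0, 1)]


def enlarge_color(red, green, blue, enlarge=replicate_2x):
    """Enlarge only green, then derive red and blue from the simple ratios."""
    ratio_r = simple_ratio(red, green)
    ratio_b = simple_ratio(blue, green)
    big_g = enlarge(green)                               # the only enlargement run
    big_r = [[ratio_r * v for v in row] for row in big_g]
    big_b = [[ratio_b * v for v in row] for row in big_g]
    # Recompose the three planes into rows of (R, G, B) picture elements.
    return [list(zip(r, g, b)) for r, g, b in zip(big_r, big_g, big_b)]
```

The red and blue planes cost one multiplication per picture element 
instead of a full enlargement each, at the price of assuming that their 
ratio to green is uniform over the image. 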






